#########
# Day 1 #
#########

E1A: How many subdirectories are present within the directory called "E_coli"?
2 subdirectories. Your directories should have a different color than the files.
There are several ways of solving this; two solutions are shown below. Both assume you are in the BacterialData folder.
1. Go into the folder and list its contents:
    cd E_coli
    ls
2. List the contents of the subfolder without entering it:
    ls E_coli

E1B: In which subdirectory can the file "E_coli_CP127297.fasta" be found?
The assemblies subdirectory in E_coli:
    E_coli/assemblies

E1C: How many files are in the "Virus" subdirectory?
    ls Virus
3 files.

E1D: What is the full path for the file named "R1_E_coli_02.fq"?
You can get the path by running this command in the directory where the file is:
    readlink -f R1_E_coli_02.fq

------------------------------------------------------------------------------------

E2A: Move the assembly file "E_coli_WWcol315.fasta" into the "assemblies" sub-directory within the "E_coli" directory.
In BacterialData:
    mv E_coli_WWcol315.fasta E_coli/assemblies

E2B: Move the read files "R1_E_coli_01.fq" and "R2_E_coli_01.fq" into the "reads" sub-directory within the "E_coli" directory.
In BacterialData:
    mv R1_E_coli_01.fq E_coli/reads
    mv R2_E_coli_01.fq E_coli/reads

E2C: In the "BacteriaData" directory, create a subdirectory named "P_aeruginosa".
    mkdir P_aeruginosa

E2D: In the "P_aeruginosa" directory, make a subdirectory called "reads".
If you're in BacterialData:
    mkdir P_aeruginosa/reads
If you're in P_aeruginosa:
    mkdir reads

E2E: In the "P_aeruginosa" directory, make a subdirectory called "assemblies".
If you're in BacterialData:
    mkdir P_aeruginosa/assemblies

E2F: Move all read files into P_aeruginosa/reads using only a single command.
The only read files that are left are those from P_aeruginosa.
We can therefore move all 'fq' files in one go, using a wildcard:
    mv *.fq P_aeruginosa/reads

E2G: Rename "P_oeruginosa_PPF1.fasta" to "P_aeruginosa_PPF1.fasta"
    mv P_oeruginosa_PPF1.fasta P_aeruginosa_PPF1.fasta

E2H: Move all assembly files into P_aeruginosa/assemblies. Try to do it with a single command.
The only fasta files left in this folder are those from P_aeruginosa. We can therefore move all fasta files at once:
    mv *.fasta P_aeruginosa/assemblies

-----------------------------------------------------------------------------------

E3A: Remove the "Credit_cards.txt" file from "BacteriaData" (nothing to see here, we promise, delete!).
    rm Credit_cards.txt

E3B: Remove the "Virus" directory and everything in it. We are only working with bacteria in this lab.
    rm -rf Virus

-----------------------------------------------------------------------------------

E4A: Make a new subdirectory called "ResFinder" inside "BacteriaData".
    mkdir Resfinder
Note that the directory is created as "Resfinder" here. Names are case-sensitive, so use the same spelling in every later command.

E4B: Copy the assembly file named "E_coli_ASM584v2_reference.fasta" from E_coli/assemblies and place the copy in the "ResFinder" directory.
From the BacterialData folder:
    cp E_coli/assemblies/E_coli_ASM584v2_reference.fasta Resfinder

E4C: Create a symbolic link to "E_coli_ASM584v2_reference.fasta" from the E_coli/assemblies directory in the "ResFinder" directory.
ln -s cannot give a link a name that is already used in the same folder. We can either give the symbolic link a different name or delete the copy made in E4B. Here we delete the copy, as keeping the same name ensures that we avoid future confusion.
From the BacterialData folder:
    rm Resfinder/E_coli_ASM584v2_reference.fasta
    ln -sr E_coli/assemblies/E_coli_ASM584v2_reference.fasta Resfinder/
The -r option stores the target relative to the link's location. With plain -s, a relative target typed from BacterialData is resolved from inside Resfinder and the link ends up broken.

E4D: Create symbolic links to all assembly files in the "E_coli" and "P_aeruginosa" directories within the "ResFinder" directory.
    ln -sr E_coli/assemblies/* Resfinder/
    ln -sr P_aeruginosa/assemblies/* Resfinder/
You will get an error that says "E_coli_ASM584v2_reference.fasta: File exists". This is nothing to worry about: we already have a symbolic link to that file, and symbolic links are still created for the rest.

-----------------------------------------------------------------------------------

E5A: Inspect the assembly file "P_aeruginosa_TOprJ3-positive_part1.fasta" using the cat command.
In BacterialData/P_aeruginosa/assemblies:
    cat P_aeruginosa_TOprJ3-positive_part1.fasta

E5B: Inspect the assembly file "P_aeruginosa_TOprJ3-positive_part2.fasta" using the less command. Observe the difference between cat and less. Which would you use for a large text file?
In BacterialData/P_aeruginosa/assemblies:
    less P_aeruginosa_TOprJ3-positive_part2.fasta
For larger text files, use less. It doesn't clutter the terminal, and you can scroll through the text more easily. Press "q" to quit.

E5C: Get the first three lines of "P_aeruginosa_TOprJ3-positive_part2.fasta" using the head command.
    head -n 3 P_aeruginosa_TOprJ3-positive_part2.fasta

E5D: Get the last five lines of "P_aeruginosa_TOprJ3-positive_part2.fasta" using the tail command.
    tail -n 5 P_aeruginosa_TOprJ3-positive_part2.fasta

E5E: Open "P_aeruginosa_TOprJ3-positive_part2.fasta" using the nano command and add the missing ">" to the fasta header.
    nano P_aeruginosa_TOprJ3-positive_part2.fasta
Add > to the fasta header, on the first line, right before lcl with no spaces, like this:
    >lcl (and then the rest of the line as is)
Press "ctrl + x" to exit, press "y" to save, and then press Enter to confirm the file name.

-----------------------------------------------------------------------------------

Extra exercise: Options to ls!
    ls -lh
Here -l gives the long listing (permissions, owner, size, date) and -h prints sizes in human-readable units.
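
As an optional check on the symbolic links from E4C and E4D, the long listing from ls also shows where each link points. The sketch below is only an illustration: it works in a throwaway directory made with mktemp, and the names example.fasta, plain.fasta, relative.fasta and the helper link_status are made up for the demo, not part of the course data. It assumes GNU coreutils, since the -r option to ln is a GNU extension.

```shell
#!/usr/bin/env bash
# Sketch: compare a plain relative symlink with one made by `ln -sr`.
# Everything here lives in a scratch directory; file names are hypothetical.

start_dir=$PWD
demo=$(mktemp -d)
cd "$demo"

# Mimic the course layout with one dummy assembly file.
mkdir -p E_coli/assemblies Resfinder
echo ">example" > E_coli/assemblies/example.fasta

# Plain -s stores the target exactly as typed. The link lives in Resfinder/,
# so the target is looked up as Resfinder/E_coli/assemblies/..., which is missing.
ln -s E_coli/assemblies/example.fasta Resfinder/plain.fasta

# -r (GNU coreutils) rewrites the target relative to the link's location.
ln -sr E_coli/assemblies/example.fasta Resfinder/relative.fasta

# In the long listing, the text after "->" is the stored target.
ls -lh Resfinder

# Hypothetical helper: [ -e ] follows links, so it fails for a dangling link.
link_status() {
    if [ -e "$1" ]; then echo ok; else echo broken; fi
}

plain_status=$(link_status Resfinder/plain.fasta)
relative_status=$(link_status Resfinder/relative.fasta)
echo "plain.fasta: $plain_status"
echo "relative.fasta: $relative_status"

# Clean up the scratch directory.
cd "$start_dir"
rm -rf "$demo"
```

Running it should report plain.fasta as broken and relative.fasta as ok, which is why the answers above use ln -sr when the link is created from BacterialData.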