tutorials:bash_scripting:part1

Differences

This shows you the differences between two versions of the page.

tutorials:bash_scripting:part1 [2012/02/24 16:12] rmilestutorials:bash_scripting:part1 [2017/10/12 21:58] (current) – external edit 127.0.0.1
Line 1: Line 1:
-**2. Converting a scanned document to a pdf document**
 +//**1. Converting a scanned document to a pdf document**//
  
-Last year I did some consulting for a law firm that required me to submit timesheets with my invoices. In any given invoice period I would undertake work involving multiple clients. Work undertaken for each client was broken down into standard categories, telephone call, email, meeting, etc.
 +----
 +Last year I did some consulting for a law firm that required me to submit time sheets with my invoices. In any given invoice period I would undertake work involving multiple clients. Work undertaken for each client was broken down into standard categories: telephone call, email, meeting, etc.
  
-I was working from my own home and my first inclination was to record everything on a spreadsheet formatted to look like the log however this was a bit cumbersome and I found that it was much simpler to just keep a log as was done in the office on the side of my desk or in my diary and pen in entries as necessary.
 +I was working from my own home and my first inclination was to record everything on a spreadsheet formatted to look like the log; however, this was a bit cumbersome and I found that it was much simpler to just keep a log on the side of my desk, or in my diary, and pen in entries as necessary.
  
 The law firm filed everything as pdf files so I had to submit my log forms via email as pdf documents.
  
-To make life easy a wrote the following script to convert the scanned log forms from an image to a pdf. I used xsane set to lineart for scanning and saved as either .jpg or pmg which resulted in an image just about the same width and height as an A4 document. With xsane set to lineart and 300 dpi, the pdf files are around 93.5kB.
 +To make life easy I wrote the following script to convert the scanned log forms from an image to a pdf. I used xsane set to lineart for scanning and saved as either .jpg or .png, which resulted in an image just about the same width and height as an A4 document. With xsane set to lineart and 300 dpi, the pdf files were around 93.5kB.
  
 <code bash>
Line 41: Line 42:
 I will now explain how this script works:
  
-Note that with the exception of the first line, any text prefixed with a hash, //#//, is ignored up until the next new line. Text prefixed with a hash is usually referred to as a comment. Comments can be put on the same line as a command but only after the command. There are no hard and fast rules about using comments. They are handy to explain things to other folks as well as oneself. I normally don't comment a small script as much as this one. Usually I just add some notes at the top and then add what I need as I go along to explain what's happening or why something needs to done a certain way
-
-<code>
 +Note that with the exception of the first line, any text prefixed with a hash, //#//, is ignored up until the next newline. Text prefixed with a hash is usually referred to as a comment. Comments can be put on the same line as a command but only after the command. There are no hard and fast rules about using comments. They are handy to explain things to other folks as well as oneself. I normally don't comment a small script as much as this one. Usually I just add some notes at the top and then perhaps add comments to explain why something is done a certain way, for future reference.
 +<code bash>
 #!/bin/bash
 </code>
  
-The first line of my script begins with the two characters " # " and " ! ". Since files are seen by programs as streams of data, a method is required to determine the format of a particular file within the filesystem. Different operating systems have traditionally taken different approaches to this problem.*   In the case of Unix and in our case Linux, " #! " will tell the kernel to treat the file as an executable script and not a machine code program. "/bin/bash" declares the path to the command interpreter that will be used. In the instance //bash//
 +The first line of my script begins with the two characters "#" and "!". Since files are seen by programs as streams of data, a method is required to determine the format of a particular file within the filesystem. Different operating systems have traditionally taken different approaches to this problem.* In the case of Unix, and in our case Linux, "#!" tells the kernel to treat the file as an executable script and not a machine code program. "/bin/bash" declares the path to the command interpreter that will be used, in this instance //bash//.
  
-<code>
 +<code bash>
 input_file=$1
 </code>
Line 55: Line 55:
 This line assigns a value to the variable //input_file//, using the first string of text (i.e. a file name) entered after the command con2pdf. More than one argument can be passed to a script when it is run and they would be numbered $1, $2, etc., but I only want to pass the name of the input file to the script in this instance.
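The numbering of arguments can be tried out without writing a separate script, because a bash function receives positional parameters in the same way a script does. The following is only a sketch: //show_args// and the argument values are made-up names for illustration and are not part of con2pdf.

```bash
#!/bin/bash
# show_args is a hypothetical helper used only for this illustration.
# Inside a function, $1, $2 and $# refer to the function's own arguments,
# just as they refer to the script's arguments at the top level.
show_args() {
  first=$1      # first argument
  second=$2     # second argument (empty if none was given)
  count=$#      # number of arguments passed
  echo "first=$first second=$second count=$count"
}

show_args scanned_image.png extra_argument
```

Running it prints first=scanned_image.png second=extra_argument count=2.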
  
-<code>
 +<code bash>
 test -n "$input_file"
 </code>
  
-This line uses //test// a bash built in command (builtin) to test if the variable is a non zero string, i.e. if a file name was passed to the script when the command //contopdf// was run. //Test// will exit with an exit status of 0 (true) if //input_file// is a non zero string and 1 (false) if //input_file// is not a non zero string. The exit does not print to stdout but it can be assigned as variable using //$?// and can then be evaluated using an //if statement//.
 +This line uses //test//, a bash built-in command (builtin), to test whether the variable is a non-empty string, i.e. whether a file name was passed to the script when the command //con2pdf// was run. //Test// will exit with an exit status of 0 (true) if //input_file// is a non-empty string and 1 (false) if it is empty. The exit status is not printed to stdout but it is available in the special variable //$?// and can then be evaluated using an //if statement//.
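To see the exit status for yourself, a small sketch like the following can be run on its own. The variable names //empty_status// and //filled_status// are assumptions made for this example; the point is that //$?// is overwritten by every command, so it has to be read or saved immediately after //test//.

```bash
#!/bin/bash
# Case 1: no file name supplied, so the string is empty.
input_file=""
test -n "$input_file"
empty_status=$?      # save $? straight away, the next command would replace it

# Case 2: a file name was supplied (hypothetical name).
input_file="scanned_image.png"
test -n "$input_file"
filled_status=$?

echo "empty: $empty_status, non-empty: $filled_status"
```

This prints empty: 1, non-empty: 0, matching the false and true cases described above.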
  
-<code>
 +<code bash>
   if [ $? -eq 1 ]; then
     echo -e "\nUsage: con2pdf [input file]\n"
Line 77: Line 77:
  
  
-<code>
 +<code bash>
 output_file=`echo "$input_file" | awk -F "." '{ print $1 }'`.pdf
 </code>
  
-Instead of passing both an input filename and an output (save) filename to the script the next line to creates and assigns an output filename to the variable //output_file//. Variables can be assigned using the output of a command when the command is enclosed in two backticks, //`//, which is the symbol below the tlde.
 +Instead of passing both an input filename and an output (save) filename to the script, the next line creates an output filename and assigns it to the variable //output_file//. Variables can be assigned using the output of a command when the command is enclosed in two backticks, //`[command]`//.
  
-In the command echo is used to print the variable //**input_file//** but instead of printing to stdout it is redirected with a pipe to //**awk//**.
 +In this line //echo// is used to print the variable //input_file//, but instead of printing to stdout the output is sent through a pipe to //awk//.
- +
-//**Awk//** or //gawk// is a pattern matching program. Here the flag //**-F//** is used to declare //**"."//** (full stop)as the field separator. For example scanned_image.png consists of two fields separated by a full stop. Awk will print the first field, //**$1//** (scanned_file) to stdout. +
- +
-Note //**.pdf//** on the same line, after the second backtick. This appends //**.pdf//** to //**$1//** so if //**$1//** was scanned_file, the variable //**output_file//** would be scanned_file.pdf+
  
 +//Awk//, or //gawk//, is a pattern-matching program. Here the flag //-F// is used to declare //"."// (full stop) as the field separator. For example, the file name //scanned_image.png// consists of two fields separated by a full stop. Awk will print the first field, //$1// (scanned_image), to stdout.
 
 +Note //.pdf// on the same line, after the second backtick. This appends //.pdf// to the output of awk, so if the first field was scanned_image, the variable //output_file// would be scanned_image.pdf.
 
 +You will find that there is often more than one way to do something when scripting. The command //cut// could also have been used in place of awk.
  
 +<code bash>
 +output_file=`echo "$input_file" | cut -d. -f1`.pdf
 +</code>
  
 +Field separators are also referred to as delimiters. In the above line, //-d.// nominates full stop as the delimiter and //-f1// selects field 1 for printing to stdout.
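One thing to keep in mind is that both awk and cut, used this way, keep only the text before the //first// full stop. As a hedged alternative that is not part of the original script, bash parameter expansion can strip just the last extension. The file name below is hypothetical and chosen to contain several full stops.

```bash
#!/bin/bash
# Hypothetical file name containing more than one full stop.
input_file="scan.2012.02.24.png"

# Field 1 only: everything after the first full stop is dropped.
with_cut=$(echo "$input_file" | cut -d. -f1).pdf

# ${input_file%.*} removes the shortest trailing match of ".*",
# i.e. only the final extension, without starting another program.
with_expansion="${input_file%.*}.pdf"

echo "$with_cut"
echo "$with_expansion"
```

The first line printed is scan.pdf and the second is scan.2012.02.24.pdf. For simple names such as scanned_image.png all three approaches give the same result.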
  
 <code bash>
-#!/bin/bash
-############################################################
-# con2pdf-gui
-# Intended to convert scanned document to pdf document
-# providing gui interface for file selection and saving.
-# requires awk
-# requires convert (from ImageMagick)
-# requires dirname
-# requires zenity to create gui file selection dialogs
-# zenity requires GTK+
-############################################################
-
-input_file=$(zenity --file-selection --title "Select an image file to convert to pdf")
-
-working_dir=$(dirname $input_file)
-
-cd $working_dir
-
-output_file=$(zenity --file-selection --save --title "Where do you want to save the pdf file?"\
---directory $working_dir --confirm-overwrite)
-
-convert $input_file $output_file
-
-# end of script
-</code>
 +convert "$input_file" "$output_file"
 
 +rm "$input_file"
 +</code>
 
 +The next two lines need little explanation.
 
 +//Convert// is an ImageMagick utility that converts images from one format to another. The file extension //.pdf// appended to the variable //output_file// ensures that the scanned document image will be converted to pdf format.
 
 +I did not want to save the document images so the next line deletes the image file.
 
 +----
 +//I almost always have a terminal open so my scripts are usually intended to be run on the command line. After saving the scanned image into the directory where the relevant pdf records were kept I would //cd// into that directory and run the command //con2pdf [image name]//.
  
 +In the next section I'll show how to modify //con2pdf// so that it will have a GUI for both selecting the image file and selecting a path and name for the resulting .pdf file.//
 +----
 +**Cheers!**
tutorials/bash_scripting/part1.1330060366.txt.gz · Last modified: 2017/10/12 21:58 (external edit)