Remote Command Injection Karteek Docsplit 0.5.4


4/1/2013
Larry W. Cashdollar
@_larry0

User supplied input isn't sanitized against shell metacharacters and is fed directly to the shell. If the user is tricked into extracting a file with shell characters in the name code can be executed remotely.

https://rubygems.org/gems/karteek-docsplit

./karteek-docsplit-0.5.4/lib/docsplit/text_extractor.rb

 59     def extract_from_ocr(pdf, pages)
 60       tempdir = Dir.mktmpdir
 61       base_path = File.join(@output, @pdf_name)
 62       if pages
 63         pages.each do |page|
 64           tiff = "{tempdir}/{@pdf_name}{page}.tif"
 65           file = "{basepath}{page}"
 66           run "MAGICKTMPDIR={tempdir} OMP_NUM_THREADS=2 gm convert -despeckle +adjoin #{MEMORY_ARGS} #{OCR_FLAGS} {pdf}[{page - 1}] #{tiff} 2>&1"
 67           run "tesseract #{tiff} {file} -l eng 2>&1"
 68           clean_text(file + '.txt') if @clean_ocr
 69           FileUtils.remove_entry_secure tiff
 70         end
 71       else
 72         tiff = "{tempdir}/{@pdf_name}.tif"
 73         run "MAGICK_TMPDIR={tempdir} OMP_NUM_THREADS=2 gm convert -despeckle #{MEMORY_ARGS} #{OCR_FLAGS} #{pdf} #{tiff} 2>&1"
 74         run "tesseract #{tiff} #{base_path} -l eng 2>&1"
 75         clean_text(base_path + '.txt') if @clean_ocr
 76       end

Run is defined as:

 94     def run(command)
 95       result = `#{command}`
 96       raise ExtractionFailed, result if $? != 0
 97       result
 98     end