4/1/2013 Larry W. Cashdollar @_larry0 User supplied input isn't sanitized against shell metacharacters and is fed directly to the shell. If the user is tricked into extracting a file with shell characters in the name code can be executed remotely. https://rubygems.org/gems/karteek-docsplit ./karteek-docsplit-0.5.4/lib/docsplit/text_extractor.rb 59 def extract_from_ocr(pdf, pages) 60 tempdir = Dir.mktmpdir 61 base_path = File.join(@output, @pdf_name) 62 if pages 63 pages.each do |page| 64 tiff = "#{tempdir}/#{@pdf_name}_#{page}.tif" 65 file = "#{base_path}_#{page}" 66 run "MAGICK_TMPDIR=#{tempdir} OMP_NUM_THREADS=2 gm convert -despeckle +adjoin #{MEMORY_ARGS} #{OCR_FLAGS} #{pdf}[#{page - 1}] #{tiff} 2>&1" 67 run "tesseract #{tiff} #{file} -l eng 2>&1" 68 clean_text(file + '.txt') if @clean_ocr 69 FileUtils.remove_entry_secure tiff 70 end 71 else 72 tiff = "#{tempdir}/#{@pdf_name}.tif" 73 run "MAGICK_TMPDIR=#{tempdir} OMP_NUM_THREADS=2 gm convert -despeckle #{MEMORY_ARGS} #{OCR_FLAGS} #{pdf} #{tiff} 2>&1" 74 run "tesseract #{tiff} #{base_path} -l eng 2>&1" 75 clean_text(base_path + '.txt') if @clean_ocr 76 end Run is defined as: 94 def run(command) 95 result = `#{command}` 96 raise ExtractionFailed, result if $? != 0 97 result 98 end