peepdf - PDF Analysis Tool


peepdf is a Python tool to explore PDF files in order to find out if the file can be harmful or not.

The aim of this tool is to provide all the necessary components that a security researcher could need in a PDF analysis without using 3 or 4 tools to make all the tasks. With peepdf, it's possible to see all the objects in the document showing the suspicious elements, supports the most used filters and encodings, it can parse different versions of a file, object streams, and encrypted files. 

With the installation of PyV8 and Pylibemu, it provides Javascript and shellcode analysis wrappers too. Apart of this, it is able to create new PDF files, modify existent ones and obfuscate them.

The main functionalities of peepdf are the following:

Analysis:
  • Decodings: hexadecimal, octal, name objects
  • More used filters
  • References in objects and where an object is referenced
  • Strings search (including streams)
  • Physical structure (offsets)
  • Logical tree structure
  • Metadata
  • Modifications between versions (changelog)
  • Compressed objects (object streams)
  • Analysis and modification of Javascript (PyV8): unescape, replace, join
  • Shellcode analysis (Libemu python wrapper, pylibemu)
  • Variables (set command)
  • Extraction of old versions of the document
  • Easy extraction of objects, Javascript code, shellcodes (>, >>, $>, $>>)
  • Checking hashes on VirusTotal

Creation/Modification:
  • Basic PDF creation
  • Creation of PDF with Javascript executed when the document is opened
  • Creation of object streams to compress objects
  • Embedded PDFs
  • Strings and names obfuscation
  • Malformed PDF output: without endobj, garbage in the header, bad header...
  • Filters modification
  • Objects modification

Execution modes:
  • Simple command line execution
  • Powerful interactive console (colorized or not)
  • Batch mode

Usage:

Usage: ./peepdf.py [options] PDF_file

Options:
  -h, --help            show this help message and exit
  -i, --interactive     Sets console mode.
  -s SCRIPTFILE, --load-script=SCRIPTFILE
                        Loads the commands stored in the specified file and
                        execute them.
  -c, --check-vt        Checks the hash of the PDF file on VirusTotal.
  -f, --force-mode      Sets force parsing mode to ignore errors.
  -l, --loose-mode      Sets loose parsing mode to catch malformed objects.
  -m, --manual-analysis
                        Avoids automatic Javascript analysis. Useful with
                        eternal loops like heap spraying.
  -u, --update          Updates peepdf with the latest files from the
                        repository.
  -g, --grinch-mode     Avoids colorized output in the interactive console.
  -v, --version         Shows program's version number.
  -x, --xml             Shows the document information in XML format.


$ ./peepdf.py -i


PPDF> help

Documented commands (type help ):
========================================
bytes           errors       js_eval           open          sctest    
changelog       exit         js_join           quit          search    
create          filters      js_unescape       rawobject     set       
decode          hash         log               rawstream     show      
decrypt         help         malformed_output  references    stream    
embed           info         metadata          replace       tree      
encode          js_analyse   modify            reset         vtcheck   
encode_strings  js_beautify  object            save          xor       
encrypt         js_code      offsets           save_version  xor_search  



No comments

Powered by Blogger.