Class PDF::Reader::ObjectHash
In: lib/pdf/reader/object_hash.rb
Parent: Object

Provides low-level access to the objects in a PDF file via a hash-like object.

A PDF file can be viewed as a large hash map. It is a series of objects stored at precise byte offsets, and a table that maps object IDs to byte offsets. Given an object ID, looking up an object is an O(1) operation.
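The offset-table idea can be illustrated with a small, self-contained sketch. It is a toy model only: the byte string, the `offsets` table, and the `read_object` helper are all hypothetical and do not reflect how PDF::Reader parses a real cross-reference table.

```ruby
require "stringio"

# Toy "file": two object bodies stored back to back in one byte string.
# This is not real PDF parsing, only an illustration of offset-based lookup.
body_one = "1 0 obj\n3469\nendobj\n"
body_two = "2 0 obj\n(hello)\nendobj\n"
io = StringIO.new(body_one + body_two)

# Hypothetical offset table: object ID => byte offset of that object.
offsets = { 1 => 0, 2 => body_one.bytesize }

# Seek straight to the recorded offset and read lines up to the end marker.
# No scan of the preceding objects is needed, which is why lookup is cheap.
def read_object(io, offsets, id)
  io.seek(offsets.fetch(id))
  lines = []
  while (line = io.gets)
    lines << line.chomp
    break if line.chomp == "endobj"
  end
  lines
end

first  = read_object(io, offsets, 1)
second = read_object(io, offsets, 2)
```

Here `second` is read without touching the bytes of the first object, because the table supplies its starting offset directly.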

Each PDF object can be mapped to a Ruby object, so passing an object ID to the [] method retrieves a Ruby representation of that object.

The class behaves much like a standard Ruby hash, including the use of the Enumerable mixin. The key difference is that there is no []= method: the hash is read-only.
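As a rough illustration of what "hash-like, Enumerable, and read-only" means in practice, the sketch below builds a small stand-in class. `ReadOnlyTable` is a hypothetical name and is not part of PDF::Reader; it only mirrors the shape described above.

```ruby
# Hypothetical stand-in class: hash-like reads, Enumerable, no writer method.
class ReadOnlyTable
  include Enumerable

  def initialize(data)
    @data = data.dup.freeze
  end

  # Read access only; no []= is defined.
  def [](key)
    @data[key]
  end

  # Enumerable builds map, select, to_a and friends on top of each.
  def each(&block)
    @data.each(&block)
  end
end

table = ReadOnlyTable.new({ 1 => 3469, 2 => "text" })

pairs    = table.to_a                                     # array of [key, value] pairs
integers = table.select { |_id, obj| obj.is_a?(Integer) } # comes from Enumerable
writable = table.respond_to?(:[]=)                        # false: no writer defined
```

Defining only `each` and including Enumerable is enough to get iteration helpers, while leaving out `[]=` keeps callers from modifying the table.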

Basic Usage

    h = PDF::Reader::ObjectHash.new("somefile.pdf")
    h[1]
    => 3469

    h[PDF::Reader::Reference.new(1,0)]
    => 3469

Methods

[]   deref   deref!   each   each_key   each_pair   each_value   empty?   encrypted?   fetch   has_key?   has_value?   include?   key?   keys   length   member?   new   obj_type   object   page_references   sec_handler?   size   stream?   to_a   to_s   value?   values   values_at  

Included Modules

Enumerable

Attributes

default  [RW] 
pdf_version  [R] 
sec_handler  [R] 
trailer  [R] 

Public Class methods

Creates a new ObjectHash object. Input can be a string with a valid filename or an IO-like object.

Valid options:

  :password - the user password to decrypt the source PDF

Public Instance methods

Access an object from the PDF. key can be an int or a PDF::Reader::Reference object.

If an int is used, the object with that ID and a generation number of 0 will be returned.

If a PDF::Reader::Reference object is used the exact ID and generation number can be specified.
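The two key forms can be sketched with a toy model. `ToyRef` and `ToyObjectHash` are hypothetical stand-ins written for this example (they are not the library's classes), and the second generation with value 4000 is invented purely for illustration.

```ruby
# Hypothetical stand-in for PDF::Reader::Reference: an ID plus a generation number.
ToyRef = Struct.new(:id, :gen)

# Hypothetical stand-in for the object hash; objects are keyed by [id, gen].
class ToyObjectHash
  def initialize(objects)
    @objects = objects
  end

  def [](key)
    @objects[normalize(key)]
  end

  private

  # An integer key means "this ID, generation 0";
  # a reference carries its own generation number.
  def normalize(key)
    case key
    when ToyRef  then [key.id, key.gen]
    when Integer then [key, 0]
    end
  end
end

toy = ToyObjectHash.new({ [1, 0] => 3469, [1, 1] => 4000 })

by_int = toy[1]                 # generation 0 is assumed
by_ref = toy[ToyRef.new(1, 0)]  # same object, generation given explicitly
newer  = toy[ToyRef.new(1, 1)]  # a different generation of object 1
```

An integer key is therefore a shorthand for the common case, and a reference is needed only when a generation other than 0 is wanted.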

deref(key)

Alias for object

Recursively dereferences the object referred to by key. If key is not a PDF::Reader::Reference, the key is returned unchanged.
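The difference between a single-step lookup and a recursive one can be sketched with a toy object table. Everything here (`ToyRef`, `TOY_OBJECTS`, `shallow`, `deep`) is hypothetical and written for illustration; it is not the library's implementation, and it does not guard against reference cycles.

```ruby
# Hypothetical stand-in for PDF::Reader::Reference.
ToyRef = Struct.new(:id, :gen)

# Toy object table keyed by [id, gen]; some values contain further references.
TOY_OBJECTS = {
  [1, 0] => { Type: :Catalog, Pages: ToyRef.new(2, 0) },
  [2, 0] => { Type: :Pages, Count: ToyRef.new(3, 0) },
  [3, 0] => 1
}.freeze

# Single-step lookup: resolve a reference once, return anything else untouched.
def shallow(key)
  key.is_a?(ToyRef) ? TOY_OBJECTS[[key.id, key.gen]] : key
end

# Recursive lookup: also resolve references nested inside hashes and arrays.
# Assumes the object graph has no cycles.
def deep(key)
  case (obj = shallow(key))
  when ToyRef then deep(obj)
  when Hash   then obj.transform_values { |v| deep(v) }
  when Array  then obj.map { |v| deep(v) }
  else obj
  end
end

one_level = shallow(ToyRef.new(1, 0))  # :Pages is still a reference
all_levels = deep(ToyRef.new(1, 0))    # nested references are replaced by objects
unchanged = deep(42)                   # non-reference keys come back as-is
```

In this sketch `one_level[:Pages]` is still a `ToyRef`, while `all_levels[:Pages]` is the fully resolved hash.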

Iterates over each key/value pair, just like a Ruby hash.

Iterates over each key, just like a Ruby hash.

each_pair(&block)

Alias for each

Iterates over each value, just like a Ruby hash.

Returns true if there are no objects in this file.

Access an object from the PDF. key can be an int or a PDF::Reader::Reference object.

If an int is used, the object with that ID and a generation number of 0 will be returned.

If a PDF::Reader::Reference object is used the exact ID and generation number can be specified.

local_default is the object that will be returned if the requested key doesn't exist.

Returns true if the specified key exists in the file. key can be an int or a PDF::Reader::Reference object.

Returns true if the specified value exists in the file.

include?(check_key)

Alias for has_key?

key?(check_key)

Alias for has_key?

Returns an array of all keys in the file.

length()

Alias for size

member?(check_key)

Alias for has_key?

Returns the type of object a reference points to.

If key is a PDF::Reader::Reference object, looks up the corresponding object in the PDF and returns it. Otherwise returns key untouched.

Returns an array of PDF::Reader::Reference objects. Each reference in the array points to a Page object, one for each page in the PDF. The first reference is page 1, the second reference is page 2, and so on.

Useful for apps that want to extract data from specific pages.

Returns the number of objects in the file. An object with multiple generations is counted once.
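The counting rule can be shown with a few hypothetical [id, generation] pairs. This is only an illustration of "counted once per ID", not a description of how the library computes the figure.

```ruby
# Hypothetical [id, generation] pairs; object 1 exists in two generations.
toy_keys = [[1, 0], [1, 1], [2, 0]]

# Count distinct object IDs, ignoring the generation number.
object_count = toy_keys.map { |id, _gen| id }.uniq.size
```

With these three entries, `object_count` is 2 because both generations of object 1 collapse into a single ID.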

Returns true if the supplied reference points to an object with a stream.

Returns an array of arrays. Each sub-array contains a key/value pair.

value?(check_key)

Alias for has_key?

Returns an array of all values in the file.

Returns an array of all values for the specified keys.
