Let's Talk About .NET, Java, and Various File Formats!

PDF and Its Structure

PDF stands for Portable Document Format. It is an open standard for document exchange. A PDF file contains both text and binary data. When a PDF file is viewed using a text editor, one can see only the raw objects which form the contents and structure of the PDF file.

A PDF file is structured in a hierarchical manner. This structure defines the order in which a PDF viewer application reads the contents and draws them on the screen. The syntax of a PDF file can be described at three levels (object, file, and document), with the content stream as a fourth component that describes appearance.

To better understand the structure of a PDF file, we need to consider it in four parts: objects, file structure, document structure, and content stream. In the following paragraphs, we’ll look at each of these parts in turn.

Objects

A PDF file is composed of a small set of basic types of data objects. Together, these basic data objects form a PDF document’s data structure. This part of the syntax also covers the character set used to write the objects and other syntactic elements. The basic types define both the properties of the objects and their syntax.
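To make the basic types concrete, here is a minimal sketch that writes a few Python values in PDF syntax. The `Name` class and the `to_pdf` function are hypothetical helpers invented for this illustration, and the mapping from Python types to PDF types is a simplifying assumption; streams and indirect objects are left out.

```python
class Name(str):
    """A PDF name object, written with a leading slash (for example /Type)."""


def to_pdf(value):
    """Serialize a Python value using the syntax of the PDF basic types."""
    if value is None:
        return "null"
    if isinstance(value, bool):   # check before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, Name):   # check before str: Name is a subclass of str
        return "/" + value
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):  # PDF reals have no exponent notation
        return f"{value:.4f}".rstrip("0").rstrip(".")
    if isinstance(value, str):    # literal string: escape backslash and parentheses
        escaped = value.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return "(" + escaped + ")"
    if isinstance(value, list):   # array
        return "[" + " ".join(to_pdf(v) for v in value) + "]"
    if isinstance(value, dict):   # dictionary: keys are always names
        items = " ".join(f"{to_pdf(Name(k))} {to_pdf(v)}" for k, v in value.items())
        return "<< " + items + " >>"
    raise TypeError(f"no PDF representation for {type(value).__name__}")


page = {"Type": Name("Page"), "MediaBox": [0, 0, 612, 792]}
print(to_pdf(page))  # << /Type /Page /MediaBox [0 0 612 792] >>
```

This is the kind of raw object text you see when opening a PDF file in a text editor.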

File Structure

The second part of the PDF document is the file structure. It defines how the basic objects are stored in the PDF file and how they are later accessed or updated. The file structure is independent of the semantics of the objects; it is only responsible for organizing and updating them.
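In practice, a simple PDF file is laid out as a header, a body of numbered objects, a cross-reference table of byte offsets, and a trailer. The sketch below assembles those four pieces for a one-page blank document. The function name `build_pdf` is made up for this example, and it assumes object 1 is the document catalog; it ignores incremental updates and compressed cross-reference streams.

```python
def build_pdf(objects):
    """Assemble header, body, cross-reference table, and trailer.

    `objects` is a list of already-serialized object bodies (bytes).
    Object numbers are assigned 1..n in order; object 1 is assumed
    to be the catalog.
    """
    out = bytearray(b"%PDF-1.4\n")                      # header
    offsets = []
    for number, body in enumerate(objects, start=1):    # body
        offsets.append(len(out))                        # byte offset of this object
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)         # cross-reference table
    out += b"0000000000 65535 f \n"                     # entry for object 0 (free)
    for off in offsets:
        out += b"%010d 00000 n \n" % off                # each entry is 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos       # where the table starts
    return bytes(out)


pdf = build_pdf([
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
])
print(pdf.decode("latin-1"))
```

Note that the table stores only offsets and object numbers. It knows nothing about what a catalog or a page means, which is the independence from semantics described above.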

Document Structure

The document structure describes how the basic objects are grouped together to form the various components of the PDF file. These components can be pages, annotations, form fields, and so on. In other words, this part describes the semantics of the components of the PDF file.
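As a rough illustration, a document's pages hang off a catalog through a tree of /Pages nodes whose /Kids are either more /Pages nodes or leaf /Page objects. The sketch below models that tree with nested Python dictionaries and walks it. This is a simplification: in a real file the children are indirect references rather than nested dictionaries, and the `iter_pages` helper and sample data are invented for this example.

```python
def iter_pages(node):
    """Yield leaf /Page dictionaries from a page-tree node, depth first."""
    if node["Type"] == "Page":
        yield node
    else:  # a /Pages node groups its children under /Kids
        for kid in node["Kids"]:
            yield from iter_pages(kid)


catalog = {
    "Type": "Catalog",
    "Pages": {
        "Type": "Pages",
        "Kids": [
            {"Type": "Page", "Annots": [{"Subtype": "Link"}]},
            {"Type": "Pages", "Kids": [{"Type": "Page", "Annots": []}]},
        ],
    },
}

pages = list(iter_pages(catalog["Pages"]))
for index, page in enumerate(pages, start=1):
    print(f"page {index}: {len(page['Annots'])} annotation(s)")
```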

Content Stream

A content stream is a sequence of instructions that describes the appearance of a page or other graphical entity. These instructions are also represented as objects; however, they are conceptually distinct from the objects that represent the document structure.
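For example, drawing a line of text takes a handful of operators, each written after its operands: BT and ET bracket a text object, Tf selects a font and size, Td positions the text, and Tj shows a string. The following sketch generates such a stream. The helper names are hypothetical, and the font resource name `F1` is an assumption that would have to match an entry in the page's resource dictionary.

```python
def text_content_stream(text, x, y, size, font="F1"):
    """Return content-stream bytes that draw `text` at position (x, y)."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    operations = [
        "BT",                    # begin text object
        f"/{font} {size} Tf",    # select font resource and size
        f"{x} {y} Td",           # move to the text position
        f"({escaped}) Tj",       # show the string
        "ET",                    # end text object
    ]
    return "\n".join(operations).encode("latin-1")


def stream_object(data):
    """Wrap raw bytes as a PDF stream: a dictionary with /Length, then the data."""
    return b"<< /Length %d >>\nstream\n" % len(data) + data + b"\nendstream"


content = text_content_stream("Hello", 72, 720, 24)
print(stream_object(content).decode("latin-1"))
```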

What is Green Computing?

Green Computing refers to environmentally sustainable computing. San Murugesan defines the field of green computing as “the study and practice of designing, manufacturing, using, and disposing of computers, servers, and associated subsystems—such as monitors, printers, storage devices, and networking and communications systems—efficiently and effectively with minimal or no impact on the environment.”

Bits, Bytes, and CO2!

Green computing is not limited to hardware; it also encompasses software and data. Whenever data is processed, it consumes computing resources; the more data we process, the more carbon dioxide (CO2) we add to the environment. The electricity consumed by cloud computing globally was projected to rise to 1,963 billion kWh by 2020, with associated carbon dioxide equivalent emissions reaching 1,034 megatons.
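As a quick back-of-the-envelope check, the two figures above imply an average emission factor of roughly half a kilogram of CO2 equivalent per kilowatt-hour. The calculation assumes that "megatons" means millions of metric tons (1,000 kg each); the variable names are just for this illustration.

```python
ELECTRICITY_KWH = 1963e9        # 1,963 billion kWh
EMISSIONS_KG = 1034e6 * 1000    # 1,034 megatons, at 1,000 kg per metric ton

kg_per_kwh = EMISSIONS_KG / ELECTRICITY_KWH
print(f"{kg_per_kwh:.3f} kg CO2e per kWh")  # 0.527 kg CO2e per kWh
```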

Write Efficient Software

One way to reduce CO2 emissions is to use energy-efficient hardware. The IT industry can also contribute to green computing by increasing its use of renewable energy. In addition, efficient software, which processes the same amount of data with leaner code and avoids redundant work to save CPU power, can also contribute greatly in this area.
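One small, concrete way to avoid redundant work is to cache results instead of recomputing them. The sketch below counts how often a function body runs with and without Python's standard `functools.lru_cache`. The word-counting task and the sample data are made up for the illustration, and real savings depend entirely on how often inputs repeat.

```python
from functools import lru_cache

calls = {"plain": 0, "cached": 0}


def word_count_plain(text):
    calls["plain"] += 1           # runs for every input, repeated or not
    return len(text.split())


@lru_cache(maxsize=None)
def word_count_cached(text):
    calls["cached"] += 1          # runs only once per distinct input
    return len(text.split())


documents = ["open formats last", "green computing", "open formats last"] * 1000

total_plain = sum(word_count_plain(d) for d in documents)
total_cached = sum(word_count_cached(d) for d in documents)

print(total_plain == total_cached)   # same answer either way
print(calls)                         # far fewer executions with the cache
```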

Role of Open File Formats

Open file formats play an equally important role in green computing. According to Wikipedia, an open file format is a published specification for storing digital data, usually maintained by a standards organization, which can therefore be used and implemented by anyone.

Proprietary file formats or standards may work only on particular platforms with certain software, while open file formats can work on any platform with a variety of tools and software for various purposes. Closed or proprietary formats are more likely to become obsolete sooner or later, leaving behind a pile of data as ‘digital waste’. This digital waste contributes to carbon dioxide emissions for no benefit. This is much less likely to happen with open file formats.

Our Social Responsibility

The IT industry should focus on efficient hardware and software, the use of renewable energy, and open file formats to contribute to green computing in the best interest of the generations to come.

Sometimes, developers need to produce Common Intermediate Language (CIL) code that meets type safety requirements. Such type-safe code can only be generated by avoiding certain language constructs. To make sure that a .NET assembly contains type-safe code, you need to verify the assembly. The assembly can be verified using the PEVerify utility from Microsoft.

If you have installed Visual Studio on your computer, this utility (peverify.exe) will already be there. It provides various options; however, the simplest way to verify an assembly is to run peverify.exe with the assembly path/name as its argument.

It will take a while and then show you the result. If the assembly contains type-safe code, the following message will be shown in the command window: All Classes and Methods in <assembly path/name> Verified.
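If you want to run this check from a script, for instance as part of a build, the following sketch wraps the simplest invocation described above. It assumes that peverify.exe is on the PATH (as it is in a Visual Studio developer command prompt) and that the success message appears on standard output. The function names and the assembly name `MyApp.exe` are hypothetical.

```python
import subprocess

SUCCESS_PREFIX = "All Classes and Methods in"


def is_type_safe(output):
    """Return True if PEVerify's output contains the success message."""
    return any(
        line.strip().startswith(SUCCESS_PREFIX) and line.strip().endswith("Verified.")
        for line in output.splitlines()
    )


def verify_assembly(assembly_path, peverify="peverify.exe"):
    """Run PEVerify on one assembly and report whether it verified.

    Assumes peverify.exe can be found on PATH; pass a full path otherwise.
    """
    result = subprocess.run(
        [peverify, assembly_path],   # simplest form: tool plus assembly path
        capture_output=True,
        text=True,
    )
    return is_type_safe(result.stdout)


# Example (hypothetical assembly name):
# verify_assembly("MyApp.exe")
```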