Document Objects
| Body of Knowledge | Document Production Workflow |
|---|---|
| Lifecycle Category | Document Objects |
| Content Contributor(s) | William Broddy m-edp |
| Original Publication | August 2014 |
| Copyright | © 2014 by Xplor International |
| Content License | CC BY-NC-ND 4.0 |
What are Document Objects?
When we compose a series of documents, we use document objects within the document(s) to create:
- typography,
- barcodes,
- business graphs,
- photos and scanned images,
- text blocks and tables, and
- color information.
There are three general categories for document objects: vector graphics, pre-rasterized images, and compound objects.
Vector Graphics: A series of instructions describing the object with draw and fill commands. Typographic fonts, barcodes, and business graphs are usually described in this way.
Pre-rasterized Images: A matrix of pixel-points that describes where to place black dots, grey-shaded dots, or color dots. Photographs and scanned images are the most common types of raster objects. However, legacy applications may still use pre-rasterized images for logos or fonts.
Figure 1 - Comparison of pre-rasterized vs. vector objects (https://commons.wikimedia.org/wiki/File:Bitmap_VS_SVG.svg).
Compound Objects: A series of layout commands and objects (or references to objects) that define a table, mailing address block, or message. We use compound objects to define a discrete area (also known as zone or region) of a page or document that is independent of other areas. More sophisticated compound documents also manage color attributes and profiles for the content within the object.
Document objects are placed and customized based on business rules and on data associated with the recipient or with the type of document.
Two things are required to compose a document: data that is unique to the specific document or series of documents, and document objects, which are the pre-defined parts of a document series.
Document objects are created as part of the document development process and stored in the application libraries associated with the document composition or modification process. Different object-types are used to maximize document fidelity (to ensure that the recipient views the document as it is intended to look). Using widely supported document object-types maximizes document portability. Document objects are also tuned to optimize the speed at which the document can be rendered for printing or for viewing on a screen.
In the early days of laser printing and screen display, circa 1980, virtually all document objects were stored as raster images. This was because the Raster Image Processor (RIP) could not render vector commands into raster images at a sufficient speed. Today most printer manufacturers take advantage of video processors developed for video gaming, which are tuned to render vector graphic objects into raster patterns very quickly. As a result, any document object that can be easily described in vector commands (e.g. logos, icons, fonts, barcodes) should be created as a vector graphic object.
Aliases and Terminology
Print resources: A name for legacy document objects that reside in libraries attached to the printer or print-server. These are fonts, logos, and text blocks (e.g. pagedefs and formdefs). These print resources are bound to the print file as it is sent to the printer RIP. This was historically done to minimize the size of print files when telecommunication bandwidth was very limited. Virtually all print resources are document objects. However, there are newer document objects (such as color management objects) that don’t conveniently fit into the print resources category.
Zones / regions: The area in which a text block resides. For example, the address text block may also be referred to as the address zone or address region. However, an address region may not be a discrete text block; it may be line text positioned from the top left corner of the page and the text block it is within may also contain other information not associated with the address, e.g. the date of issue.
Hard vs. soft resources: The same document object is often used repetitively within a document or series of documents, e.g. the corporate logo. To eliminate re-rendering each time the object is used, it may be defined as a hard resource so that it can be rendered once and then cached for repeated placement. A soft resource may be used only once, such as a text block describing a table of individual transactions for a specific recipient.
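The render-once-then-cache behavior of a hard resource can be sketched as follows. This is a minimal illustration, not how any particular RIP works: `ResourceCache`, `place`, `fake_render`, and the object identifiers are all hypothetical names introduced for the example.

```python
from typing import Callable, Dict


class ResourceCache:
    """Caches rendered hard resources; soft resources are rendered on every use."""

    def __init__(self, render: Callable[[str], bytes]) -> None:
        self._render = render  # hypothetical stand-in for the RIP's rasterization step
        self._cache: Dict[str, bytes] = {}
        self.render_calls = 0

    def place(self, object_id: str, hard: bool) -> bytes:
        if hard and object_id in self._cache:
            return self._cache[object_id]  # reuse the raster rendered earlier
        self.render_calls += 1
        raster = self._render(object_id)
        if hard:
            self._cache[object_id] = raster
        return raster


def fake_render(object_id: str) -> bytes:
    # Placeholder only: a real RIP would produce a raster image here.
    return f"raster:{object_id}".encode("ascii")


cache = ResourceCache(fake_render)
for page in range(3):
    cache.place("corporate-logo", hard=True)              # rendered once, then cached
    cache.place(f"transactions-page-{page}", hard=False)  # rendered on every page
print(cache.render_calls)  # 4: one logo render plus three transaction tables
```

The logo is placed three times but rendered only once, while each recipient-specific table is rendered every time it is placed.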
Document Object Management
The development and use of document objects must be properly managed as part of the ongoing maintenance and upkeep of document applications. For example, many organizations use document objects to produce corporate logos and trademarks on their documents. Over time, instances of these objects will reside in many different libraries or be hard-coded into applications. If they are not properly managed, it may take months to introduce a new corporate logo in all the locations where the old one is found.
With the introduction of message management within transaction documents, we need to tightly manage the development, test and promotion of compound objects containing marketing messages. These marketing messages are often encapsulated with vector graphics and/or photographic images, and usually include references to typographic fonts. Marketing departments typically have much shorter deadlines for launch into production than the classic IT change control process allows. Therefore, the tools that manage development, test and launch of these objects follow a different development path than other document objects.
General Object Categories
Document objects have evolved over the past 35 years since the first laser printers were introduced. The generations of specific object-types are listed in reverse chronological order.
Vector-Based Objects
The current generations of the following document objects are usually created using vector commands. However, there may be legacy formats still in use that date back to the early days of laser printing that may be in raster format.
Vector-based objects exploit the power of processing chips designed primarily for video game display. These chips can render vector-based objects at incredible speeds. Describing an object using vector commands provides a number of advantages:
- The object size is usually much smaller than an identical looking object stored as a raster object.
- The RIP processor can turn the object into a raster image that is the correct resolution and color profile for the printer (or video display) on which it’s rendered, e.g. a vector font character will be converted to a 600 dpi image that can then be enhanced beyond 600 dpi through antialiasing on the specific printer. If the font character is to be printed in a specific RGB color, the RIP can tune the CMYK color percentages used based on the color profile of the printer.
- The object can be oriented in any direction (0 – 360º) without affecting the fidelity of the object.
- The object can be easily resized without affecting the fidelity of the object, i.e. the roughness of the edges and the graininess of the fill.
Examples of vector commands include:
- draw line,
- draw circle,
- draw box,
- draw ellipse, and
- fill.
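To illustrate why a vector description is small and resolution-independent, here is a minimal sketch that rasterizes one filled-box command at two resolutions. The dictionary-based command format and the `rasterize` function are assumptions made for this example; they do not correspond to any real page description language.

```python
from typing import List

# One vector command in a hypothetical mini-format; all measurements are in inches.
BOX = {"op": "fill_box", "x": 0.25, "y": 0.25, "w": 0.5, "h": 0.5}


def rasterize(cmd: dict, size_in: float, dpi: int) -> List[List[int]]:
    """Turn one fill_box command into a square grid of 0/1 pixels at the given dpi."""
    n = round(size_in * dpi)
    x0, y0 = round(cmd["x"] * dpi), round(cmd["y"] * dpi)
    x1 = round((cmd["x"] + cmd["w"]) * dpi)
    y1 = round((cmd["y"] + cmd["h"]) * dpi)
    return [
        [1 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(n)]
        for y in range(n)
    ]


low = rasterize(BOX, size_in=1.0, dpi=8)      # coarse device: 8 x 8 pixels
high = rasterize(BOX, size_in=1.0, dpi=600)   # 600 dpi printer: 600 x 600 pixels
print(len(low), sum(map(sum, low)))     # 8 16
print(len(high), sum(map(sum, high)))   # 600 90000
```

The same five-field command yields 64 pixels on the coarse device and 360,000 pixels at 600 dpi, whereas a pre-rasterized version would have to store one of those grids and be resampled for the other.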
Historically, vector-based objects were introduced in the desktop publishing environment by Apple and Adobe. The speed of the RIP was not as important as the quality of the rendered master because the master would then be used to create hundreds, if not thousands, of copies.
The adoption of vector-based objects in high speed electronic printing was governed by:
- The availability of affordable high speed vector processing chips for use in the printer’s RIP.
- The ability to group vector processing chips to handle parallel object rendering.
Since every page printed is really a master, the RIP must be able to render pages much faster, usually 2 – 3 times faster, than the actual speed at which the pages are printed. With inkjet color printers approaching 5,000 (full-color) impressions per minute, the RIP would likely have to render at 40,000 monochrome impressions per minute (5,000 imp x 4 colors x 200%).
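The arithmetic in the paragraph above can be written out as a short calculation. The function name and parameter names are illustrative; the headroom factor of 2.0 corresponds to the 200% used in the text.

```python
def required_rip_rate(impressions_per_min: int, color_planes: int, headroom: float) -> float:
    """Monochrome impressions per minute the RIP must render.

    Each full-color impression is treated as one monochrome impression per
    color plane, and headroom is how much faster than print speed the RIP runs.
    """
    return impressions_per_min * color_planes * headroom


print(required_rip_rate(5_000, 4, 2.0))  # 40000.0, the figure quoted in the text
print(required_rip_rate(5_000, 4, 3.0))  # 60000.0, at the upper end of the 2 - 3x range
```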
Terms Used in Typography
A font generally refers to the physical form of a typographic design or typeface. It can be a case of wood or metal characters or a computer file containing raster or vector information. For our purposes a font is the instantiation of a specific typographic face, height, weight, and style (e.g. Helvetica 12pt Medium Italic). When the first electronic (rasterized) fonts were distributed they were packaged in the traditional way by height, weight and style.
With the introduction of outline fonts there are no longer unique objects for each size and character in a typeface, so today it is common to refer to the entire font family as a font. When licensing fonts, be sure to read the fine print to confirm what rights you are granted by the license and what variations are included. Remember that fonts acquired from “free font” sites or compilation CDs may not be licensed for corporate use.
Typeface
Figure 2 - Example of the Frutiger typeface used for signage as originally intended (https://commons.wikimedia.org/wiki/File:Loc.-Lugano.PNG).
A typeface (also known as a font family or even a font) is a series of sizes, weights, and styles that are designed for a common, and complementary, look. A type designer develops a typeface either to a commission brief (as in the case of Frutiger) or as a general offering (as in the case of Avenir). Once the designer has created the typeface it may be distributed through a type foundry or independently. A typeface may include a wide range of variations, such as Oblique, Semibold and Ultrabold, or it may be limited to Roman, Italic, Normal and Bold.
Font Height
Height is specified in points. Prior to the 1980s the size of a point might have varied by foundry, but since the 1980s, when desktop publishing emerged, points have been defined as 72nds of an inch. This firm definition arose because the displays of that era had a resolution of 72 dots per inch.
The point size describes the vertical height of the character box in which the character resides. The character box for a font is large enough for the tallest ascender in the font as well as the lowest descender. A 10 pt. (point) font would have a character box that is 10/72nds of an inch high.
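As a worked example of the 72-points-per-inch definition, the following sketch converts a point size to inches and to device pixels. The function names are illustrative, and the 600 dpi figure is simply a common printer resolution used for the example.

```python
POINTS_PER_INCH = 72


def points_to_inches(points: float) -> float:
    """Height of the character box in inches."""
    return points / POINTS_PER_INCH


def points_to_pixels(points: float, dpi: int) -> float:
    """Height of the character box in device pixels at the given resolution."""
    return points * dpi / POINTS_PER_INCH


print(round(points_to_inches(10), 4))       # 0.1389 inch for a 10 pt font
print(round(points_to_pixels(10, 600), 1))  # 83.3 pixels at 600 dpi
print(points_to_pixels(12, 600))            # 100.0 pixels at 600 dpi
```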
There are key measurements that are part of describing a character, beyond the point size.
X-height: The x-height is the height of the lower case ‘x’ in a font. The ‘x’ is usually the height of the body of all lower case characters, not including the ascender and descender strokes.
Ascender: Ascenders are the strokes that appear above the x-height of the character.
Descender: Descenders are the strokes that extend below the baseline on which the body of the character sits.
Font Weight
The designer may create additional weights of the font by thickening the strokes within the character. Most fonts will have a bold weight. However, there can be many weights within a font, hypothetically up to 800 with OpenType and TrueType. The ISO/IEC 9541 Font Architecture[1] specifies nine weights:
- Ultralight,
- Extralight,
- Light,
- Semilight,
- Medium (normal),
- Semibold,
- Bold,
- Extrabold, and
- Ultrabold.
Pitch
The width of mono-spaced characters, based on the approximate number of characters that can be placed in a linear inch, for example, 10-pitch has approximately 10 characters per inch. Most line-print applications are set for a 10-pitch character set, but there are options for 12-pitch and 15-pitch in some environments.
We say approximate because the mechanical print trains used in true line printers used identically spaced characters. Digital fonts approximate the fixed character widths but there is some variation.
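Because pitch is characters per inch and a point is 1/72 of an inch, the nominal character width follows directly. The sketch below assumes ideal, exactly fixed-width characters, and the 8-inch printable line is a hypothetical value chosen for the example.

```python
import math


def pitch_to_width_points(pitch: float) -> float:
    """Nominal advance width of one mono-spaced character, in points."""
    return 72 / pitch


def chars_per_line(pitch: float, line_width_in: float) -> int:
    """Whole characters that fit on a line of the given width."""
    return math.floor(pitch * line_width_in)


for pitch in (10, 12, 15):
    print(pitch, pitch_to_width_points(pitch), chars_per_line(pitch, 8.0))
```

Under these assumptions a 10-pitch character is 7.2 points wide and 80 characters fit on the 8-inch line; 12-pitch and 15-pitch give 96 and 120 characters respectively.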
Font Style
There are three major styles within a font family.
Roman: The Roman, or regular, style is used for text within the body of a document. The characters are upright, not slanted. Most information that we read is set in Roman style.
Italic or Oblique: Characters slope or slant in an italic (or oblique) style. Oblique styled fonts are slanted to the right but based on the Roman form of the characters. Italic fonts are also slanted to the right but use different glyphs for the characters.
Condensed: Condensed characters are designed to look similar to the regular face but have a noticeably narrower width. Condensed fonts are often used in signage that could be read at a severe angle, such as shelf labels. We sometimes use a condensed font in a column heading where we want a larger point size that doesn’t take as much width.
Ligatures: Designers often include an extended set of characters that are either a combination of 2 – 3 characters (e.g. fl) or a diacritic, a character that is not in our modern alphabet (e.g. æ and œ).
Figure 3 - Ligatures (https://commons.wikimedia.org/wiki/File:Ligatures.svg).
The ampersand (&) character was originally a ligature for ‘E’ and ‘t’ meaning ‘et’, the Latin word for ‘and’. The Trebuchet font’s ampersand character, &, is close to the original ligature.
Icons: There are a number of symbol characters that are used to represent words, and often transcend one language. Using icons allows us to express a thought in one character that may take a word, or phrase, in full text. They can be a very effective way to catch the reader’s attention and to save space.
Extended characters in most fonts:
™ - Unregistered trademark
® - Registered trademark
© - Copyrighted material
£ - UK or Irish Pound
€ - Euro
Kerning
Certain pairs of characters look better when they are nudged together. The designer can add ‘hinting’ to the digital font, describing how to kern a specific pair of characters.
Typographic Fonts
Typographic fonts are used to present textual information within a document. By choosing the correct font or fonts for a document we improve its readability.
We use fonts for different purposes. With some documents, such as marketing brochures and posters, we use stylized text to capture the customer’s attention. With other documents, such as client statements and bills, user manuals, and legal agreements, we want to make the document as readable and comprehensible as possible. Even with the clearest writing we can still make a document unreadable by choosing the wrong font, the wrong size, the wrong spacing, and the wrong column length.
Readability
Contrast: There are a number of things that can make textual information more readable. The most important is the contrast between the characters and the background; you want as much contrast as possible, i.e. 100% black (RGB 000 000 000 / CMYK 0 0 0 100%) on absolute white (RGB 255 255 255 / CIE 145 paper).
Word shape: The next most important is the shape of the words. This is controlled by using mixed case text, providing enough white (quiet) space around the word, and the choice of font face.
As we learned to read we actually were taught to read word-shapes — the series of black strokes within characters positioned close enough together that we recognize a word blob, not individual letters. If there is not enough white space around the words, or the actual shape of the word is unrecognizable because of the shape of the characters, we revert back to looking at individual characters to spell out the word. The most glaring example of this is presenting words in upper case.
Number readability and columnar integrity: Most recipients of transaction documents are most interested in the numeric information presented. It’s important that numbers in a font are distinct; they typically are shown as 6 – 8 digits.
Many of the most popular corporate fonts have numbers that are indistinguishable from each other when presented in a seven digit string at 10 pt. The numbers that are easiest to distinguish have 6’s with an upward-pointing stroke and 9’s with a downward-pointing stroke. Conversely, 6’s with strokes that curl back into the character are hard to distinguish from 8’s, 3’s, and 9’s.
It is also important that the numbers used in columns all have the same width. Many fonts contain variable width numbers that look good in descriptive text but do not properly line up in a column. Although most state-of-the-art composition systems are capable of mitigating this problem, check that the font contains proper hinting on how to render the numeric character in a column.
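One way to check the equal-width requirement is to compare the advance widths of the ten digits. The sketch below uses made-up widths in font units; they are not measured from any real font, and `is_tabular` and `string_width` are illustrative helpers rather than part of any composition system.

```python
from typing import Dict

DIGITS = "0123456789"


def is_tabular(advance: Dict[str, int]) -> bool:
    """Return True when every digit has the same advance width."""
    return len({advance[d] for d in DIGITS}) == 1


def string_width(text: str, advance: Dict[str, int]) -> int:
    """Total advance width of a digit string, in font units."""
    return sum(advance[ch] for ch in text)


# Hypothetical widths in font units; not taken from any real font.
proportional = {d: 550 for d in DIGITS}
proportional["1"] = 320  # a narrow "1", as often found in variable-width figures
tabular = {d: 550 for d in DIGITS}

for name, widths in (("proportional", proportional), ("tabular", tabular)):
    print(name, is_tabular(widths),
          string_width("1111111", widths), string_width("8888888", widths))
```

With the variable-width digits, two seven-digit amounts differ in total width (2240 versus 3850 units), so their digits cannot line up in a column; with equal-width digits both strings measure 3850 units.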
Font Structure
Serif vs. Sans Serif: Is it descriptive text, a table or a heading?
One of the major classifications of fonts is serif versus sans (without) serif. Serifs are the strokes, or glyphs, that are added to the character to provide more differentiation and to speed up the reading process. There are heated debates amongst graphic designers and typographers as to which style is more readable. Many preferences are driven by cultural, geographic, and educational backgrounds.
Serif Fonts
Historically, font faces with serifs were used to present descriptive information (sentences and paragraphs). Most literature is set in a serif font, as are most newspapers.
It’s important to understand how our eyes perceive written words. When reading descriptive information, our eyes jump across the line of text, pausing for a fraction of a second to recognize a series of 3 – 5 words, before jumping to the next part of the line. This is known as a saccadic jump. The optimal line of textual information is 60 characters (or 8 – 10 words), resulting in 2 – 3 saccadic jumps per line[2]. The eye then travels back through the whitespace below the line to find the next line (a reason to have plenty of white space, or leading, between lines of text).
If the eye does not recognize the word shape(s), the eye moves back on the line to interpret the characters and context (known as regression)[3]. Serifs direct the eye to move quickly across the line in a paragraph of information, and make out the word shapes better. Most adults can more quickly read a 1,000 word article that is set in a serif font[4].
Some of the most commonly used licensed serif fonts include[5]:
- Baskerville
- Garamond
- Caslon
- Rockwell
- Bembo
- Minion
- Century
- Sabon
- Bodoni
- Palatino
- Clarendon
- Times New Roman
Sans Serif Fonts
In the chronology of fonts, sans serif (or groteske) fonts were introduced much later than serif fonts, approximately the same time as the industrial revolution. They were originally intended for signage and posters because they looked cleaner and more industrial. They have become more commonly used for scientific, transportation, and financial tables.
Product and procedural manuals are set in sans serif fonts, as is most scientific and engineering research. Through this evolution, sans serif fonts have become associated with quantitative and tabular information. Some of the most commonly used sans serif fonts include:
- Akzidenz Grotesk
- Helvetica
- Avenir
- Lucida Sans
- Frutiger
- Myriad
- Futura
- Optima
- Gill Sans
- Univers
Other Types of Fonts
Script fonts: These fonts (also called cursive fonts) mimic calligraphy and handwriting. They are used to mimic handwritten notes or a signature. They should be used sparingly.
Ornamental fonts: These fonts contain special characters that are not found within a standard font. These characters, like a finger pointing, can be used in bulleted lists to embellish a block of text. They are not usually used within unbulleted descriptive (business) information. Common ornamental fonts include dingbats, web bats, and ornamentals. Examples of their character sets are shown below:
Ornamental fonts should be used sparingly because they add an additional series of objects to manage and render at the printer. They can also impact search indexing as the hex value for the character may be the same as an alphanumeric character in a regular font.
Asian Language Fonts
Written Asian languages, including Chinese, Japanese, Korean, and Vietnamese, each have thousands of characters. Traditional Latin language fonts contain no more than 256 characters, which allows each character to be identified in a single byte. Asian language fonts use multiple bytes to identify a character, allowing hundreds of thousands of characters within the font. The term double-byte is used to describe fonts that use 2 bytes per character.
Multi-byte fonts are usually issued for each Asian language, Hànzì in Chinese, kanji in Japanese, hanja in Korean, and Chữ Nôm in Vietnamese. However, you can also buy Chinese / Japanese / Korean (CJK) packs. The Asian language fonts include all major Latin, Cyrillic, math, and financial characters.
Asian languages can be written either horizontally or vertically. Asian characters sit in a fixed square space (including the Latin characters). Most business correspondence is written horizontally, but there could be instances where an informational, educational or marketing message could be written vertically.
Although most European and North/South Americans may not think about the inclusion of Asian fonts in a document production system, there is a growing need (especially within Wealth Management) to provide client support messages for Asian customers in their language of choice. Also, Asian multinationals have developed worldwide client-support systems that use full Asian language (double or multi-byte) databases, and generate transaction documents that use Asian language fonts.
There are many different formats for Asian language fonts:
- multi-byte:
  - singletons, lead unit, trail units,
- double-byte:
  - IBM DBCS (EBCDIC based): shift in/shift out,
  - CJK DBCS (ASCII based): Big5, GB, S-JIS, KSC,
- Unicode:
  - UTF-8 = variable-width encoding (maximizes compatibility with ASCII),
  - UTF-16 = variable-width encoding, and
  - UTF-32 = fixed-width encoding.
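The difference between variable-width and fixed-width encodings can be seen by counting the bytes each one needs for a few characters. This is a small sketch using Python's built-in codecs; the `byte_lengths` helper and the choice of sample characters are assumptions made for the example.

```python
from typing import Dict

UNICODE_ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def byte_lengths(ch: str) -> Dict[str, int]:
    """Number of bytes needed to store one character in each Unicode encoding."""
    return {enc: len(ch.encode(enc)) for enc in UNICODE_ENCODINGS}


samples = {
    "Latin A (U+0041)": "A",
    "CJK ideograph (U+6F22)": "\u6f22",
    "CJK Extension B (U+20000)": "\U00020000",
}
for label, ch in samples.items():
    print(label, byte_lengths(ch))

# A legacy double-byte code page for comparison: Shift JIS (S-JIS).
print(len("\u6f22".encode("shift_jis")), len("A".encode("shift_jis")))
```

UTF-8 uses 1, 3, and 4 bytes for the three samples and UTF-16 uses 2, 2, and 4, while UTF-32 always uses 4. In Shift JIS the ideograph takes 2 bytes and the Latin letter takes 1.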
Choosing the Correct Font
We choose the fonts based on a number of factors.
Style Guidelines
Corporations and governments often have preferred fonts that they use in all customer communication. These fonts are often specified in their corporate style guide, which is usually available from the corporate communications department. Company brochures, signage, and advertising text will typically use a single licensed font[6] (or one serif font and one sans serif font). Because web font formats are not always available the brand guidelines may also call out a web font that has a similar look.
Whenever possible, the mail owner should use the same typographic guidelines for transaction documents as they would use in other printed customer documents.
Most current composition systems and production printers can properly exploit current vector-based licensed fonts. However, many legacy applications that create line-print files, Xerox Metacode files, or very early versions of AFP files may require a legacy raster font to print. While some have similar names to the popular fonts like Helvetica and Times, they are only approximations.
Most licensed fonts have not been easy to use in a web environment. As a result, document owners will sometimes choose one of the core fonts within HTML. With the more recent levels of HTML (4 and higher), some licensed fonts can now be incorporated into webpages. However, the legal issues associated with web font licenses may still be an inhibitor.
One of the advantages of presenting electronic documents in PDF format is that licensed fonts (and font licensing) can be easily handled.
Font Use
Although there are no strict rules on serif versus sans serif usage, there are some conventions, based on consumer familiarity:
- headings are typically in a sans serif font,
- tabular information is usually in a sans serif font,
- telephone numbers, email addresses, street addresses, and websites are usually in a sans serif font,
- descriptive information is usually in a serif font, and
- descriptive call-outs are usually in an italicized serif font.
MICR
MICR is not a font, but a technology with several components. The Magnetic Ink Character Recognition (MICR) process uses magnetically charged characters that are read at extremely high speeds. The read-head on the sorter is similar to the read-head on a magnetic hard-drive; it recognizes variances in the magnetic properties of the characters that it’s reading. Unlike an Optical Character Reader, it can misread perfectly formed characters if the ink (or toner) does not have the correct magnetic properties. Conversely, it can read malformed characters, as long as the magnetic properties of the characters are correct.
There are two typefaces used to create characters with the right magnetic properties. In the United States of America, Canada, Australia, United Kingdom, Japan, India, Mexico, Colombia, and Turkey, the E13B font (promoted by IBM) is used to encode checks and other payment instruments (often referred to as ‘presentments’). The CMC-7 font (promoted by Groupe Bull) is used in France, Spain, Israel, South America (except Colombia) and other Mediterranean countries.
MICR printing remains machine readable, even when over-stamped, marked, or mutilated.
MICR is most commonly used for checks. However, many checks are now captured as images as they enter the bank clearance systems, and the images are then processed as though they were physical checks. MICR is also found in direct mail applications.
Common Font Object Formats
The following are common font object formats. They are listed in reverse chronological order:
OpenType
Most licensed fonts are provided in the OpenType font format. The format was developed by Microsoft with Adobe’s participation, and announced in 1996. OpenType exploits vector commands (commonly referred to as outline data in the font world). The creators wanted the structure to take advantage of more advanced ‘quadratic’ vector commands to provide finer detail within a character, and better handle the complexities of non-Latin languages. “OpenType fonts can also include typographic refinements such as true small caps, different styles of figures, and extensive sets of ligatures and alternates, as well as complete sets of accented characters and diacritical marks[7].”
OpenType was intended as a replacement for TrueType and Adobe font formats. To do this, the predecessor fonts needed to be supported. Also, it has evolved to provide extensive web and tablet support. As a result, there are a number of sub-types of OpenType:
- Open Font Format (OFF): ISO adopted a version of the OpenType format in 2007[8] known as Open Font Format (OFF). Unless otherwise specified, an OpenType font will be OFF formatted.
- Web Open Font Format (WOFF): WOFF is a font format that is used within webpages by most browsers. It became a W3C Recommendation in late 2012[9]. The format essentially wraps TrueType, OpenType and Open Font Formats with metadata that allows it to be downloaded, used and erased after use. The fonts are compressed within the transmitted object. The WOFF process enforces a same-origin policy to ensure that a downloaded font cannot be used by other (improperly licensed) websites, or by applications on the recipient’s computer.
OpenType supports outline fonts that were originally created for TrueType or for PostScript (CFF). The formats are:
- ClearType rendering technology: Microsoft introduced the ClearType subpixel rendering technology to improve the edges of characters on high resolution displays. It’s available on all Windows operating systems after XP. The technology can render OpenType and TrueType fonts, and there are a limited number of fully defined ClearType fonts available.
- TrueType: The TrueType[10] format was developed by Apple and Microsoft in the late 1980s in competition to Adobe’s PostScript (Type 1) font format. TrueType provided a higher level of scalability for outline fonts compared to Type 1.
The major technical difference is that TrueType uses quadratic Bezier curves, which seem to render quicker, with a higher level of scalability, than cubic curves.
Adobe PostScript Fonts: The Adobe PostScript font was designed by Adobe for professional typesetting. The fonts were described in vector commands called cubic Bezier curves, originally used with graphic plotters and radar screens. These cubic commands do not easily render into a raster environment (common displays and laser printers).
- Type 1: The initial format for Adobe fonts was called Type 1, and was intended for graphic arts applications. The rendering process can be quite slow due to the number of commands required to create a character using the cubic structure.
- Compact Font Format (CFF) - Type 2: In order to improve the speed of rendering to screen and to electronic (lower resolution) printers, Adobe introduced CFF, in conjunction with Adobe Acrobat (Version 3) and Portable Document Format v1.2 (PDF). The CFF format still uses a ‘limited’ cubic structure that subsets to only the necessary commands for ‘low resolution’ rendering. This speeds up the rendering process without visual degradation.
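The quadratic curves used by TrueType and the cubic curves used by PostScript outlines differ in the number of control points per segment. The sketch below evaluates both with the standard Bezier formulas and shows that any quadratic segment can be written exactly as a cubic one by degree elevation. The function names and control points are illustrative, and the example says nothing about relative rendering speed.

```python
from typing import Tuple

Point = Tuple[float, float]


def quadratic(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Point on a quadratic Bezier curve (three control points) at parameter t."""
    u = 1.0 - t
    return tuple(u * u * a + 2 * u * t * b + t * t * c
                 for a, b, c in zip(p0, p1, p2))


def cubic(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Point on a cubic Bezier curve (four control points) at parameter t."""
    u = 1.0 - t
    return tuple(u ** 3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t ** 3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def elevate(p0: Point, p1: Point, p2: Point) -> Tuple[Point, Point, Point, Point]:
    """Express a quadratic segment as an equivalent cubic segment."""
    c1 = tuple(a + 2.0 / 3.0 * (b - a) for a, b in zip(p0, p1))
    c2 = tuple(c + 2.0 / 3.0 * (b - c) for b, c in zip(p1, p2))
    return p0, c1, c2, p2


q = ((0.0, 0.0), (1.0, 2.0), (2.0, 0.0))  # hypothetical outline segment
print(quadratic(*q, 0.5))                 # (1.0, 1.0)
print(cubic(*elevate(*q), 0.5))           # approximately the same point
```

The reverse conversion, from cubic to quadratic, is generally only an approximation, which is one reason font tools treat the two outline types differently.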
AFP Font Object Content Architecture (FOCA): Most transaction documents are produced using Advanced Function Presentation (AFP) file structures. High speed production printer RIPs used the AFP object structure to optimize rendering speeds and fidelity.
- FOCA is the object type used to define AFP-specific fonts within the AFP environment[11]. These fonts have been specifically designed for AFP presentation environments. They include:
- 38PP fonts: The 38PP fonts were the fonts designed for the initial implementation of AFP in the mid-1980s.
- Other fonts supported in the AFP architecture: The AFP architecture (but not FOCA) also supports other font-types as data resources, including:
- OpenType
- TrueType
Core fonts: Microsoft includes a series of fonts within the Windows operating system. The W3C also specifies a series of fonts that are to be included with any W3C HTML (level x and higher) server. Both are known as the core fonts. Personal printer manufacturers usually include a core font set on their printers, based on the then current Windows core fonts.
Many applications[12] will assume that a core font is present on the printer unless you otherwise specify. Problems can occur if Windows removes the font as part of a new version; or if you change the targeted printer, and it does not have the font.
Windows: Microsoft Windows provides a set of fonts to be used with Windows applications[12]. Microsoft updates the core font set with each major version. You can check the Microsoft font page to find out which fonts are provided with each operating system.
Web fonts: Microsoft led a project to introduce a family of core web fonts. Although the project was cancelled in 2002, many HTML servers still support references to the following fonts. With the implementation of WOFF, the requirement for predefined fonts is disappearing:
- Andale Mono (replaced by Lucida Console on later Windows operating systems)
- Arial
- Arial Black
- Comic Sans MS
- Courier New
- Georgia
- Impact
- Times New Roman
- Trebuchet MS
- Verdana
- Webdings
Input: Where do fonts come from?
Most fonts are created by professional font designers who sell (via a user license):
- Through distributors (often called foundries, an artifact from the days when font characters were cast in metal). Major font distributors include Adobe and Monotype (including its Linotype and Fonts.com subsidiaries).
- As part of another software product, e.g. Microsoft Windows, Microsoft Office.
- As part of a print management system, or embedded within a printer’s firmware.
Output: Where do we place fonts for use in the production process?
Font objects are loaded into production libraries associated with:
- composition system software,
- print stream utility software, and
- print management systems.
The software application embeds the font object into the print stream before the print stream is presented to the RIP process for rendering.
Graphics
Graphics and Comprehension
We use graphical representation to improve document comprehension, and as a complement to textual information.
- Graphical branding, such as corporate logos.
- Business graphics, to illustrate critical information:
- Pie or Bar charts are used to show distributions,
- Line charts are used to show activities over time.
Vector commands have a number of advantages over raster graphics, including faster handling by Raster Image Processors (RIPs) to render logos and business graphs. Vector graphics also produce cleaner edges when scaled.
Graphic Object Types
There are hard and soft graphic objects. Hard objects are developed and pre-tested as discrete objects, such as corporate and product logos. The object is stored in final format and then loaded into the print stream as a defined resource that can be rendered once and then repetitively placed.
Soft business graphics are generated during composition based on stored business rules. The RIP renders the graphic and places it immediately.
The most common graphic types are:
- Scalable Vector Graphics (SVG): SVG is a World Wide Web Consortium (W3C) recommendation that provides two-dimensional vector graphic support[13]. It allows objects to be described using graphic commands. All current web browsers support SVG and most smartphones and tablets support the SVG mobile subset. SVG can include pre-rasterized images and vector fonts. Not all composition systems, Digital Front Ends on printers, or RIPs support the SVG object type.
- Encapsulated PostScript (EPS): Most graphic arts applications support EPS as an output format. Although EPS files are usually formatted pages or documents, you can use the format to develop objects, including logos. EPS objects can include vector commands, pre-rasterized images and vector-based fonts.
- AFP Graphic Object Content Architecture (GOCA): GOCA is the AFP object type that supports generic vector-graphics, such as business graphs, forms, and templates (also called overlays)[14]. There is also a specialized object type for barcodes (called BCOCA) that uses most of the commands within this format. The GOCA format uses draw rules, such as Line, Full Arc, Partial Arc, Fillet and Cubic Bezier Curve.
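As an illustration of a soft business graphic described with vector commands, the sketch below builds a small SVG bar chart as a string. The function name `bar_chart_svg`, the default sizes, and the fill color are assumptions made for this example; they are not taken from any composition system.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(values, labels, width=200, height=100, fill="#336699"):
    """Return a minimal SVG bar chart as a string (illustrative sketch only)."""
    if not values or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and equal in length")
    peak = max(values)
    if peak <= 0:
        raise ValueError("at least one value must be positive")
    slot = width / len(values)  # horizontal space reserved for each bar
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}" viewBox="0 0 {width} {height}">'
    ]
    for i, (value, label) in enumerate(zip(values, labels)):
        bar_h = height * max(value, 0) / peak  # tallest bar fills the full height
        x = i * slot + slot * 0.1
        y = height - bar_h                     # SVG y axis runs downward
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{slot * 0.8:.1f}" '
            f'height="{bar_h:.1f}" fill="{fill}">'
            f"<title>{escape(label)}</title></rect>"
        )
    parts.append("</svg>")
    return "".join(parts)


svg = bar_chart_svg([30, 60, 90], ["Q1", "Q2", "Q3"])
```

Because each bar is a `rect` command rather than a grid of pixels, the same string can be rendered at any size without re-rasterizing, provided the target system supports SVG.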
Input
Most graphic design tools support graphic object output formats. These include Microsoft PowerPoint (PPT native format, XPS, PDF, and XML) through to Adobe InDesign (EPS, PDF). Hard objects (saved in a supported format) can be created using a design tool and then promoted into a production environment. Check your composition tool and/or printer RIP to see what graphic object types are supported.
Composition and/or print stream utilities can generate unique soft objects that are imbedded within the print stream. GOCA objects will be imbedded in AFP files, and EPS could be imbedded in PostScript.
Output
Hard objects are pre-designed, tested and promoted to production resource libraries. Soft (unique to the document) objects are generated with the creation (or modification) of the document and carried within the document.
Barcodes
We use barcodes on documents, packaging and signage to provide machine readable information. This can be done to:
- assist in the tracking of the document through the manufacturing and distribution process,
- identify the document to the sender, the recipient or interested third parties (IMB codes in address block),
- create a call to action as a part of a marketing campaign (QR Code on a brochure or advertisement),
- carry information in machine readable format that is also presented as text (QR Code with business card information), and
- allow downstream applications to validate the authenticity of the document or capture information off of it.
Reading Barcodes
Barcodes can be read with either an LED Scanner or a digital camera.
LED Scanners fire a constant light signal at the barcode area, and then read the returning light signal that reflects off of the white background (black ink or toner will absorb almost all of the transmitted light). The reader mechanism converts the light/no light signal into on/off bits. It then translates the on/off bits into alphanumeric information.
Cameras take a picture of the barcode and then convert the binary matrix into alphanumeric characters.
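The light/no light conversion can be pictured as a simple threshold over reflectance samples. The sketch below is a simplification under stated assumptions: samples are normalized to the range 0.0 to 1.0, a dark bar is mapped to 1, and the 0.5 threshold is arbitrary. Real readers also handle noise, skew and varying scan speed.

```python
def reflectance_to_bits(samples, threshold=0.5):
    """Map reflectance samples (0.0 = no light returned, 1.0 = full) to bits.

    Assumption for this sketch: a dark bar reads as 1, the light background as 0.
    """
    return [1 if sample < threshold else 0 for sample in samples]


def run_lengths(bits):
    """Collapse a bit sequence into (bit, run_length) pairs, i.e. bar and space widths."""
    runs = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1] = (bit, runs[-1][1] + 1)
        else:
            runs.append((bit, 1))
    return runs


scan = [0.9, 0.9, 0.1, 0.1, 0.1, 0.8, 0.2, 0.9]
bits = reflectance_to_bits(scan)
widths = run_lengths(bits)
```

The run lengths are what a decoder would then compare against the bar and space widths defined by the barcode schema.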
Check Bars
Most barcodes use specific bars for registration and validation. Linear barcodes use start and end bars. 2D barcodes use track points: QR codes use three corner squares called Position Detection Patterns, while Data Matrix codes are delineated by two solid adjacent borders in an "L" shape (called the "finder pattern") and two other borders consisting of alternating dark and light "cells" or modules (called the "timing pattern").
Barcode Topography and Readability
Every barcode schema has minimum specifications to maximize readability. The best standard to follow is the one provided in the manufacturer’s specifications for the reader being used. If you don’t know the reader, you should use the industry specification for that barcode. Remember that the faster the reading process, the tighter the tolerance to the spec.
- Orientation: The direction in which the barcodes can be read. Most linear barcodes can be read in the 0 and 90 degree orientation.
- Minimum height: The minimum height of the longest stroke. This will be expressed in inches or millimeters (mm) in the barcode reader specification, although some composition systems may express it in point size (72nds of an inch).
- Some barcodes (e.g. 4SB) have strokes with ascenders and/or descenders. The minimum height would be the height of the stroke with both.
- Minimum stroke width: How wide the black strokes must be, and how much is required between them, to be read. This is usually expressed in inches or millimeters.
- Minimum contrast: The minimum percentage difference between the light (white[15]) background and the dark (black) bars. Although most barcode readers work best with 100% contrast, there are tolerances for off-color paper or non-black ink.
- In addition, certain paper characteristics can degrade readability:
- shine caused by calendering or chemicals,
- brightness[16],
- rough finish, and
- surface warping during printing.
- Minimum quiet zone: There needs to be a minimum background space around each barcode that is free of anything that contrasts, such as variable print or preprinted information or decoration. You should also ensure that the reverse side is clear of anything that contrasts, because almost all paper lets some of the reverse side show through under LED light signals.
- Vector-object barcodes: Barcode readers require good dark to white transition that can be best achieved with crisp line edges. Printer RIPs (and video displays) can best render ‘clean’ barcodes that are described in vector commands.
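Because reader specifications usually state minimum heights in inches or millimeters while some composition systems work in points, a unit conversion is often the first check. The sketch below assumes 72 points per inch as stated above; the function names and the sample minimum are hypothetical and do not come from any reader specification.

```python
POINTS_PER_INCH = 72
MM_PER_INCH = 25.4


def points_to_mm(points):
    """Convert a measurement in points (72nds of an inch) to millimeters."""
    return points / POINTS_PER_INCH * MM_PER_INCH


def meets_minimum_height(bar_height_pt, min_height_mm):
    """True if a bar height given in points satisfies a minimum given in mm."""
    return points_to_mm(bar_height_pt) >= min_height_mm


# A 36 pt bar is half an inch, or 12.7 mm.
half_inch_mm = points_to_mm(36)
```

For example, `meets_minimum_height(18, 12.0)` is false, because 18 points is only 6.35 mm.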
Barcode Formats
Character Barcodes
The first barcodes were character-based. This was because impact line-printers and early laser printers could only print a series of pre-rasterized characters that had the vertical or horizontal stroke (or strokes) across the entire character box. If identical characters were printed on top of one another you could create a larger barcode. The problem with a character-based barcode is that the RIP has no idea that it is a barcode and cannot properly tune the strokes. Barcode fonts should be avoided.
Image Barcodes
As barcodes became more complex, software was designed to render barcode images that could be presented to the RIP as a binary image. These utilities were good for UPC and other complex interleaved barcodes. They were designed to work on 300 dpi devices with consistent optical density. As printers ran faster and increased to 600 dpi, the barcode image generators had to be updated for the new technology. Problems could occur between vendor technologies (different RIP algorithms; different optical density) because the RIP would likely use the binary image as is. Scaling or rotating the barcode image can completely change its printing characteristics. Barcode images should be avoided.
Vector Graphic Barcodes
Most production printer RIPs can render cleaner and more faithful barcodes from vector commands. Some of the presentation architectures actually support barcodes as a native object-type. Others use the SVG format.
AFP BarCode Object Content Architecture (BCOCA)
The most common native barcode format used for Transaction Document Production is BCOCA. With the object architecture, barcode objects need only to provide the following information so that the RIP can render it:
- schema,
- box size, including quiet space,
- orientation,
- X/Y position on the page, and
- information to be encoded.
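The five items above can be pictured as a small record that a composition system hands to the RIP. The sketch below is only a conceptual model: the class and field names are invented for illustration and do not reflect the actual BCOCA data structures or field encodings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BarcodeRequest:
    """Hypothetical descriptor for a natively rendered barcode (not real BCOCA)."""
    schema: str             # e.g. "Code 128"
    box_width_mm: float     # box size, including quiet space
    box_height_mm: float
    orientation_deg: int    # rotation relative to the page
    x_mm: float             # position on the page
    y_mm: float
    data: str               # information to be encoded

    def __post_init__(self):
        if self.orientation_deg not in (0, 90, 180, 270):
            raise ValueError("orientation must be 0, 90, 180 or 270 degrees")
        if self.box_width_mm <= 0 or self.box_height_mm <= 0:
            raise ValueError("box size must be positive")
        if not self.data:
            raise ValueError("there must be information to encode")

    def footprint_mm(self):
        """Width and height occupied on the page after rotation."""
        if self.orientation_deg in (90, 270):
            return (self.box_height_mm, self.box_width_mm)
        return (self.box_width_mm, self.box_height_mm)


request = BarcodeRequest("Code 128", 60.0, 15.0, 90, 10.0, 40.0, "INV-000123")
```

The point of the model is that no bars or pixels are described; the RIP derives the strokes from the schema and the data.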
Barcode Schema
There are many barcode schemas with different functions and purposes. Some have been developed by specific industries (e.g. UPC by retailers, 4SB by the Universal Postal Union) or by standards groups (EAN-13 by GS1).
Optical Mark Recognition (OMR)
OMR is the first and most primitive form of variably encoded barcodes. It was implemented prior to the availability of laser printers and deploys two standard characters, a pair of horizontal underscores ‘_’ [17] for barcodes that run vertical and a single vertical line ‘|’ [18] for barcodes that run horizontal.
This barcode is still used with inserting systems on older automated mailing applications. It has limited functionality for the amount of page real estate that it uses, compared to newer barcode schemas.
Linear Barcodes
Interleaved 2 of 5
Code 39 (3 of 9)
The Code 39 barcode is used to track documents through the manufacturing process, primarily at the inserting stations. The schema was developed for Boeing by Intermec to track airplane parts.[19] It was standardized by the American National Standards Institute (ANSI) in 1983.[20]
It is built from 39 characters (the uppercase letters A-Z, the numbers 0-9, space, hyphen and period) plus the additional special characters $ / + %. It does not require a check digit, which means that it is more susceptible to misreads. It also takes more space to present information than later versions (see below) or 2D barcodes.
The US government specification, LOGMARS[21], is based upon Code 39.
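Where misreads are a concern, Code 39 data can carry an optional modulo 43 check character. The sketch below shows that calculation: each character is assigned a value from 0 to 42 by its position in the standard Code 39 character order, the values are summed, and the remainder after dividing by 43 selects the check character. The function name is an assumption for this example.

```python
# Standard Code 39 character order; the index of each character is its value (0-42).
CODE39_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"


def code39_check_char(data):
    """Return the optional modulo 43 check character for Code 39 data."""
    total = 0
    for ch in data:
        value = CODE39_CHARS.find(ch)
        if value < 0:
            raise ValueError(f"{ch!r} is not a Code 39 character")
        total += value
    return CODE39_CHARS[total % 43]


check = code39_check_char("ABC")  # values 10 + 11 + 12 = 33
```

The check character is appended to the data before the stop character, so the reader must be configured to expect and verify it.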
Code 93
This is a compact version of Code 39 that is not widely used.
Code 128
The Code 128 barcode provides higher density information than Code 39. It uses the 128 ASCII character set, and can use the full Latin-1 character set. The GS1-128[22] barcode subset is used worldwide for product container and pallet tracking.
Universal Product Code (UPC)
There are a number of variations of UPC. They are used to identify products for sale within retail stores. The barcode is mainly used at the point of sale to add its price to the sales transaction.[23]
UPC-A
This barcode is used to identify standard products within retail stores. UPC-A consists of a unique 12 digit number, based on the Global Trade Item Numbers (GTIN) specification. The barcode uses 11 data digits plus a check digit. There is room in the specification for 2 or 5 optional digits.
It was developed by the US Uniform Grocery Product Code Council[24], and first implemented in 1974. It is widely used in the United States, Canada, the United Kingdom, Australia, New Zealand, and in other countries.
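The twelfth digit of a UPC-A number is a check digit calculated from the first eleven. A minimal sketch of the standard calculation follows; the function name is an assumption for this example.

```python
def upc_a_check_digit(first_eleven):
    """Return the UPC-A check digit for the first 11 digits of the number."""
    if len(first_eleven) != 11 or not (first_eleven.isascii() and first_eleven.isdigit()):
        raise ValueError("expected exactly 11 digits")
    digits = [int(c) for c in first_eleven]
    odd_sum = sum(digits[0::2])   # positions 1, 3, 5, ... counted from the left
    even_sum = sum(digits[1::2])  # positions 2, 4, 6, ...
    return (10 - (odd_sum * 3 + even_sum) % 10) % 10


check_digit = upc_a_check_digit("03600029145")
```

A scanner repeats the same calculation on the digits it reads and rejects the scan if the result does not match the printed check digit.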
EAN-13
The European Article Numbering (EAN) 13 barcode is a superset of the US developed UPC-A barcode. The barcode adds an extra digit at the beginning of the code, allowing for 10 trillion variations.
The EAN-13 code can be used with most North American UPC scanning systems.
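The superset relationship can be sketched in a few lines: a 12 digit UPC-A number becomes a valid EAN-13 number when a leading zero is added, and the check digit does not change. The function names below are assumptions for this example.

```python
def ean13_check_digit(first_twelve):
    """Return the EAN-13 check digit for the first 12 digits of the number."""
    if len(first_twelve) != 12 or not (first_twelve.isascii() and first_twelve.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(first_twelve))
    return (10 - total % 10) % 10


def upc_a_to_ean13(upc_a):
    """Express a complete 12 digit UPC-A number as a 13 digit EAN-13 number."""
    if len(upc_a) != 12 or not (upc_a.isascii() and upc_a.isdigit()):
        raise ValueError("expected exactly 12 digits")
    return "0" + upc_a


ean = upc_a_to_ean13("036000291452")
```

The leading zero shifts every UPC-A digit by one position, so the digits that UPC-A weights by 3 are still weighted by 3, which is why the check digit carries over unchanged.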
UPC-E (End-bit)
The UPC-E barcode is used on small packaging on which an ‘in spec’ UPC-A barcode will not fit. It cannot be used as a substitute for EAN-13. It uses the end-bit to convert the binary value of the displayed six digits.
Two Dimensional (2-D)
PDF 417
The PDF-417 barcode is used by a number of industries and government organizations. The airline industry uses it for boarding passes. The US Department of Homeland Security has selected it for REAL ID[25] compliant drivers’ licenses and other identity cards. It was developed by Symbol Technologies in 1991 and is an ISO standard[26]. It supports all ASCII characters, includes error correction, and can carry up to about 1850 ASCII or 2725 numeric characters.
Data Matrix
The Data Matrix barcode is used for labeling small items (e.g. electronics) or items where the barcode must be unobtrusive (e.g. documents within a manufacturing process). The US Electronic Industries Association endorses its use for electronic parts identification. It is also used extensively within the food industry. It can carry a great deal of information in a small space.
The barcode is an ISO standard[27]. It uses dark and light cells, with the left and bottom edge cells all darkened (to indicate the orientation). The information to be encoded can be up to 2,335 characters of alphanumeric data, including error correction codes.
Quick Response (QR Code)
3D
High Capacity Color Barcode
Disguised Barcodes
Microglyph (formerly Dataglyph)
The microglyph format can be used to digitally sign documents in a way that is unobtrusive. A microglyph object could be applied, as a hard resource, to a series of documents to authenticate the issuer of the document. A unique microglyph could be generated for each document, as a soft resource, to authenticate both issuer and the intended recipient.
The Dataglyph was invented at the Xerox Palo Alto Research Center, and offered by Xerox as a proprietary process[28]. Xerox has since licensed the technology to Microglyph Technology GmbH[29], which enhanced it and renamed it ‘microglyph®’.
The schema uses extremely small forward ‘/’ and back ‘\’ strokes of different widths to simulate pixels within a photo or line art. The barcode interpreter determines which way the stroke slants, and translates them to bits. The bit stream is then converted to data. The data is usually encrypted.
Raster Objects
One of the easiest ways to describe information to print is as an image. We have been transmitting and printing documents since the early 1920s. The first technology, Wirephoto, used an analog signal to scan and transmit black on white images to the recipient device. These technologies, along with analog television cameras, were the starting-point for image capture technologies and the standards that morphed into many of the digital raster image object-types that we use today.
In the early days of laser printing we used these technologies and formats to create most of the document objects we needed. Raster objects minimized rendering, but rasterized fonts, logos and business graphics were pre-rendered to the resolution of the printing device and potentially tuned to its idiosyncrasies.
If a print job was targeted to a print device with a different print resolution, the rasterized objects had to be re-rendered in the new resolution. A common challenge for companies using AFP workflows came when the common machine resolution changed from 240 dots per inch (dpi) to 300 dpi. Re-rasterizing any 240 dpi fonts, logos or raster graphics was a time-consuming process fraught with potential errors.
Attempts to transform from one resolution to another often produced mediocre results. Rescanning was not always an option as original artwork was often not available, and many object issues were only discovered during production testing (or even in actual production).
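The arithmetic behind these mediocre results is easy to sketch. Moving from 240 dpi to 300 dpi scales every pixel dimension by 5/4, so a stroke only lands on whole pixels when its original width is a multiple of 4; anything else has to be rounded or blended. The function names below are assumptions for this example.

```python
from fractions import Fraction


def rescaled_pixels(pixels, source_dpi, target_dpi):
    """Exact pixel count needed at target_dpi to keep the same physical size."""
    return Fraction(pixels) * target_dpi / source_dpi


def maps_cleanly(pixels, source_dpi, target_dpi):
    """True if the rescaled size is a whole number of pixels."""
    return rescaled_pixels(pixels, source_dpi, target_dpi).denominator == 1


one_pixel_stroke = rescaled_pixels(1, 240, 300)   # 5/4 of a pixel
```

A one-pixel hairline at 240 dpi needs 1.25 pixels at 300 dpi, which a device that prints whole dots cannot reproduce exactly. This is one reason fine strokes in fonts and barcodes tend to suffer most in resolution conversions.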
Today, we try to limit the use of raster objects to scanned documents and signatures (usually stored as TIFFs), and pictures (stored as JPEGs).
The best practice is to avoid using raster objects for fonts, barcodes, business graphics or line drawings, and to use vector-based objects instead. Printer RIPs can render vector versions of these objects more accurately than raster versions.
Scanned Raster Objects
Organizations capture an enormous number of paper based documents via scanners. Scanned documents can be more easily managed as part of business workflow and/or archived as corporate records. These documents are usually stored in either TIFF or PDF Image[30] format.
Outbound documents often include scanned images from these systems. Examples include check images printed as part of a monthly bank statement, client-signed shipping manifests printed as backup to an invoice, and insurance applications appended to the issued policy.
To effectively use these scanned pages as document objects, it is important to understand in what format, and at what resolution, the original document was stored. Many documents are scanned at a low resolution (readable on a display, but fuzzy when printed at 600 DPI) to minimize storage and transmission size. They will look like a FAX if printed at their original size since most RIPs will not apply anti-aliasing to TIFF (or JPEG) objects under the assumption that this could be interpreted as tampering with the original rendition.
Tagged Image File Format (TIFF)[31]
TIFF was originally developed by Aldus Corporation, which was acquired by Adobe Corporation in September 1994. It was promoted as a universal scanning format for desktop publishing and was adopted by most scanning technology companies in the mid-1980s.
Its original format was binary dot placements (1 for black, 0 for white), but evolved to support grey-scale dot placement (using either a half or full byte to describe the shade of grey)[32], and eventually color.
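The effect of bit depth on storage can be estimated with simple arithmetic. The sketch below computes the uncompressed size of a scanned page, assuming each row of pixels is padded to a whole number of bytes; the function name and the letter-size example are assumptions for illustration, and real files add header and tag overhead.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed image data size for a scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8  # pad each row to whole bytes
    return row_bytes * height_px


# An 8.5 x 11 inch page scanned at 300 dpi (2550 x 3300 pixels).
bilevel = uncompressed_bytes(8.5, 11, 300, 1)     # 1 bit: black or white
grey_half = uncompressed_bytes(8.5, 11, 300, 4)   # half byte per dot
grey_full = uncompressed_bytes(8.5, 11, 300, 8)   # full byte per dot
```

Under these assumptions the same page grows from roughly 1 MB as a binary image to roughly 8.4 MB as full-byte greyscale, which is why compression matters for scanned documents.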
It is the preferred format for storing scanned documents because it uses lossless compression algorithms as its default. You can use the JPEG (Lossy) compression format within TIFF when encapsulating photographs.
There are many extended versions of TIFF generated by vendor software. A best practice is to use Baseline TIFF[33] as the RIP may not be able to faithfully render extended functions such as layers or JPEG and LZW compression.
TIFF Extensions
TIFFs can have multiple sub-files (usually pages of a document) within the object. Multi-page TIFFs require pre-processing to generate an individual object for each page before use as document objects.
The most popular compression formats for scanned documents are CCITT G3.1 (specified within baseline TIFF), and CCITT T.4 / T.6 (allowed as supplier extensions, but not universally supported). It is important to know whether the page images to be used are the latter and whether this impacts the RIP's rendering of the object.
You may run into TIFF files that are compressed using Apple’s PackBits format (specified within baseline TIFF). There are also TIFF extensions that support uncompressed CMYK layers (4 layers, each supporting an 8 bit color channel) within an object. Vendors can also add private tag metadata to the TIFF file to exploit proprietary functions.
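A quick way to tell whether a TIFF needs this pre-processing is to count the image file directories (IFDs) chained from its header, since each page normally has its own IFD. The sketch below walks that chain for a classic (non-BigTIFF) file using only the standard library. The function name is an assumption, and the sample bytes are a hand-built skeleton with empty directories used purely to exercise the walk; it is not a viewable image.

```python
import struct


def count_tiff_pages(data):
    """Count the top-level IFDs chained in a classic TIFF byte string."""
    if data[:2] == b"II":
        order = "<"   # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        order = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("missing TIFF byte-order mark")
    magic, offset = struct.unpack_from(order + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF (magic number is not 42)")
    pages, seen = 0, set()
    while offset:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        (entries,) = struct.unpack_from(order + "H", data, offset)
        # Each entry is 12 bytes; the 4-byte offset of the next IFD follows them.
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + entries * 12)
        pages += 1
    return pages


# Header pointing at offset 8, then two empty IFDs (the second ends the chain).
skeleton = (b"II*\x00\x08\x00\x00\x00"
            b"\x00\x00\x0e\x00\x00\x00"
            b"\x00\x00\x00\x00\x00\x00")
page_count = count_tiff_pages(skeleton)
```

This only counts the main chain; images stored in sub-IFDs, which some vendor extensions use, would need additional handling.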
For any extensions to TIFF it will be important to determine that the RIP will be able to process it without error.
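PackBits is a simple byte-oriented run-length scheme, which makes it a useful example of what a RIP has to do before it can place the image. In the sketch below, a control byte below 128 means "copy the next n + 1 bytes literally", a control byte above 128 means "repeat the next byte 257 - n times", and 128 itself is skipped. The function name is an assumption for this example.

```python
def unpack_bits(packed):
    """Decode PackBits run-length compressed bytes."""
    out = bytearray()
    i = 0
    while i < len(packed):
        n = packed[i]
        i += 1
        if n < 128:
            # Literal run: copy the next n + 1 bytes unchanged.
            count = n + 1
            if i + count > len(packed):
                raise ValueError("literal run extends past end of data")
            out += packed[i:i + count]
            i += count
        elif n > 128:
            # Repeat run: the next byte is repeated 257 - n times.
            if i >= len(packed):
                raise ValueError("repeat run is missing its data byte")
            out += bytes([packed[i]]) * (257 - n)
            i += 1
        # n == 128 is a no-op and is skipped.
    return bytes(out)


packed = bytes.fromhex("FE AA 02 80 00 2A FD AA 03 80 00 2A 22 F7 AA")
unpacked = unpack_bits(packed)
```

Here 15 packed bytes expand to 24 bytes of image data. Long runs of identical bytes, such as the white background of a scanned page, are where the scheme saves the most space.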
Scanned Portable Document Format Files
Adobe controls both the PDF and TIFF formats. The TIFF format has been stable since 1996, and that format is used as a subcomponent within PDF files.
Most new document scanning applications use the PDF format as the container for a scanned document. Within that document, PDF uses Baseline TIFF images for each scanned page. The scanned document can then contain metadata such as indexing information, and text information that emerges from Optical Character Recognition.
As with multi-page TIFFs, you will need to pre-process these PDFs to get to the individual TIFF objects. Most likely you’ll need to deconstruct the PDF files into a series of TIFF objects. Within the AFP environment, however, you can pass the entire PDF file to the RIP and instruct it to render only a specific page. This avoids the need to deconstruct PDF image files.
Joint Photographic Experts Group (JPEG) Format
JPEG is both the common lossy compression format for digital photography (used within many different image file formats, such as TIFF and RAW), and a file format in its own right. The latest JPEG standards can be found in ISO/IEC 10918.[34]
The JPEG format accurately renders photographs and paintings with realistic variations in hue and color. However, it is not optimized for line drawings or scanned documents.
JPEG photographs captured using digital cameras are saved in the JPEG Exif format, which is restricted to 24-bit color (8-bit shading for the R, G and B color channels). However, many professional cameras now support 32-bit color (8-bit shading for the R, G, B and Alpha (pure white) color channels). 32-bit color photos may have the .jpg file extension but are likely RAW format inside.
Exif formatted photos do not contain an ICC color profile. You will need to add the desired color profile to the JPEG metadata before using it as an object.
There is a wide compression range within the JPEG format. The highest quality uses all 8 bits for each color dot placement (plus about 10% overhead, making the file size about 9 bits per color dot). Lower quality photos can be scaled down to less than 1 bit per color dot. Photos with severe compression will not render well on color printers (although they will look fine when viewed on a smartphone or tablet).
It is important to understand that the following items will impact speed and fidelity of rendering at the RIP:
- Whether the source picture size is the same as the size when placed. Although RIPs can rescale JPEGs, this takes an enormous amount of computing, and the results can be disappointing. If you want to place a 1 inch square picture, make sure it’s pre-rendered to that size.
- The severity of compression. The less the compression, the more faithful the rendition and the less likely that you’ll see artifacts on color boundaries.
- 32 bit color (RGBA). Although RGBA photos look more colorful when viewed on the latest LED displays (and will look amazing on the soon-to-be-released OLED displays[35]), the Alpha channel can cause havoc at the RIP.
RAW Image Format
These files (sometimes called digital negatives) are the minimally processed photographic files captured by a camera or scanner. RAW files usually have a much wider color gamut than JPEGs. However, there are numerous RAW formats used amongst the camera vendors. For that reason, they must be further processed to be usable, usually transformed into (or encapsulated within) JPEGs.
AFP Image Object Content Architecture
“The Image Object Content Architecture (IOCA) has been formulated to provide a format suited for high speed printing. IOCA contains enough flexibility that a wide variety of images can be printed, but formats images in such a way that they can be printed efficiently and with minimal processing.”[36]
IOCA provides a common container for the many different types of image formats, allowing them to be properly rendered by the RIP.
IOCA supports AFP-specific formats as well as many of the formats described above. However, TIFF, JPEG and other non-AFP file types must be encapsulated in an IOCA wrapper to be used. Image types supported include:
- JPEG algorithms,
- TIFF PackBits,
- TIFF LZW,
- CCITT Group 3,
- Adaptive Bi-level Image Compression (ABIC) – extensively used in bank check image capture systems
- InfoPrint MMR—Modified Modified Read
- No compression
- Run Length 4
- G3 MR—Modified READ
- G4 MMR—Modified Modified READ
- JBIG2
Color is extensively supported in IOCA Function Set 45.[37]
Page Segments (PSEG)
The original format for (uncompressed) IBM AFP image objects.
.IMG
The format for image objects within the Xerox Metacode environment.
Other File Formats
You may run across images in the following formats. Most likely you will need to transform them into TIFF or JPEG format before they can be rendered.
- GIF
- BMP
- PNG
- PPM, PGM, PBM, PNM and PFM
- PAM
Input
Image objects are usually captured using
- Digital cameras,
- Scanners, and
- Graphic design software.
Compound Objects
The documents we build today are like Russian matryoshka dolls; we nest logical subset objects within larger objects. For example, within a file containing thousands of documents, we find each document nested inside with a beginning and ending tag. Within each document, we find each page with beginning and ending tags. Within a page we find text block, graphic and image objects presented (or referenced).
The advantage of building documents out of nested objects is that it is easier to develop each object independently. We can build the document’s address block object independently and then place it into position independent of other page objects (such as tables, messages, contact information, logo, page number, and packaging barcode) to compose the page. This allows programmers to change the position of the address block without having to adjust the position of other page objects. It allows print stream utilities to find the address block within the document and update/correct the mailing address without impacting other objects within the page. And, it allows the RIP to optimize how it renders the objects, such as pre-building the logo object once, caching it, and calling it whenever it is referenced.
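The nesting described above can be sketched as a small parser that turns a flat stream of begin and end tags into a tree. The record vocabulary used here (`begin`, `end`, `object`) and the object names are hypothetical; real presentation architectures define their own structured fields.

```python
def nest(records):
    """Turn a flat list of (kind, name) records into a nested tree of dicts.

    kind is 'begin', 'end' or 'object'. This vocabulary is invented for the sketch.
    """
    root = {"name": "file", "children": []}
    stack = [root]
    for kind, name in records:
        if kind == "begin":
            node = {"name": name, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)          # later records nest inside this node
        elif kind == "end":
            if len(stack) == 1 or stack[-1]["name"] != name:
                raise ValueError(f"unbalanced end tag: {name}")
            stack.pop()
        elif kind == "object":
            stack[-1]["children"].append({"name": name, "children": []})
        else:
            raise ValueError(f"unknown record kind: {kind}")
    if len(stack) != 1:
        raise ValueError("missing end tag")
    return root


stream = [
    ("begin", "document"),
    ("begin", "page"),
    ("object", "address_block"),
    ("object", "logo"),
    ("end", "page"),
    ("end", "document"),
]
tree = nest(stream)
```

Because the address block is a distinct node in the tree, a utility can locate and replace it without touching its siblings, which is the independence the paragraph above describes.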
The first generations of transaction documents, starting in 1958 with the introduction of the IBM 1403 impact printer and continuing to the introduction of AFPDS in 1984, were created in much the same way as the documents they replaced; they were typed using steel slugs. In countries using European alphabets the print file started on the upper left-hand corner of the paper and added characters (and spaces) until it reached the end of the text line, and then placed a carriage return command to move the print carriage to the next line. Sophisticated programs may have had skip to next page commands. The printers of that era used steel characters held together on a print-train. The fonts were hard objects that were hammered into an inked ribbon in front of the paper. If you wanted to change the font you could only do so before you started the print job (by removing the print-train and installing another). Early print files did not delineate the beginning/end of documents or even pages. To move the address block it might be necessary to shift multiple print lines, potentially misaligning all of the trailing information because it had to be exactly placed onto a pre-printed form. Although this may seem like ancient history, there are hundreds of 1403-style print files printed every month in large enterprises.
In a recent survey of its clients, Acadami found that approximately 90% of their clients’ mainframe print jobs (by number, not volume) were described as either line-mode, 1403, or AFP Compatibility (line mode with pagedefs/formdefs).
Building with Objects
With the introduction of electrophotographic laser printers that could address any point on the page, the technical constraints of impact character printing were eliminated. The RIP could place information anywhere on the page as it rendered the raster pattern.
By using multiple objects to build pages and documents, file sizes can be structured to minimize redundant instances of resources, saving significant space. Documents and pages are identified, allowing a print controller or a document viewing application (such as Acrobat) to interpret a specific document or page only, and its referenced resources.
By identifying hard or multiple-use resource objects at the start of the presentation file, the RIP only has to render repeated resources once, and then cache them for use when referenced. This speeds up the rendering process while allowing the RIP to better calibrate (e.g. ICC color profile[38]).
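The render-once, place-many behavior can be sketched as a simple cache keyed by resource name. The class name and the stand-in render function are assumptions for this example; an actual RIP manages rasterized resources in device memory with its own eviction rules.

```python
class ResourceCache:
    """Render each named resource once and reuse the result on later references."""

    def __init__(self, render):
        self._render = render   # function that turns a resource name into a raster
        self._cache = {}
        self.render_count = 0

    def get(self, name):
        if name not in self._cache:
            self._cache[name] = self._render(name)
            self.render_count += 1
        return self._cache[name]


# Stand-in for an expensive rasterization step.
cache = ResourceCache(lambda name: f"<raster of {name}>")
for _page in range(1000):
    cache.get("logo")           # referenced on every page
cache.get("signature")          # referenced once
```

After the loop the logo has been referenced 1,000 times but rendered only once, so `cache.render_count` is 2 for the two distinct resources.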
Text Block Objects are anchored by referencing a starting X-Y position. The position can be absolute (positioned from the upper left hand corner of the page) or relative (positioned in relation to another object). To adjust the address block object on page 1 in Figure 3, change its start position from the start of the page (an absolute positioning). Or, widen the space between the bottom of Table 1 (in Figure 3) and the start position of Table 2 by changing the relative position.
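A minimal sketch of how absolute and relative start positions might be resolved follows. It assumes a simplified model in which a relative object is offset from the bottom-left corner of the object named in its `below` field; the field names, units and sample layout are hypothetical and are not taken from the figures.

```python
def resolve_positions(objects):
    """Return absolute (x, y) start positions for every object in the layout.

    Objects without a 'below' key are absolute (measured from the top-left of
    the page). Objects with one are offset from the bottom of that object.
    """
    resolved = {}

    def resolve(name, trail=()):
        if name in resolved:
            return resolved[name]
        if name in trail:
            raise ValueError(f"circular positioning involving {name}")
        spec = objects[name]
        x, y = spec["x"], spec["y"]
        anchor = spec.get("below")
        if anchor is not None:
            anchor_x, anchor_y = resolve(anchor, trail + (name,))
            x += anchor_x
            y += anchor_y + objects[anchor]["height"]
        resolved[name] = (x, y)
        return resolved[name]

    for name in objects:
        resolve(name)
    return resolved


layout = {
    "address_block": {"x": 20, "y": 45, "height": 30},
    "table_1": {"x": 20, "y": 100, "height": 60},
    "table_2": {"x": 0, "y": 10, "height": 40, "below": "table_1"},
}
positions = resolve_positions(layout)
```

Changing the `y` offset of `table_2` widens or narrows the gap below `table_1` without touching any other object, and moving `table_1` carries `table_2` along with it.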
In the case of the Barcode object on page 1 of Figure 3, the object can reserve space around it. This keeps other objects from interfering with its quiet zone, an area surrounding the barcode object that is deliberately left blank to ensure a proper read.
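Reserving space can be sketched as growing the barcode's bounding box by the quiet zone and testing other objects against the enlarged box. The class, the function name and the sample measurements below are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float
    y: float
    width: float
    height: float

    def grown(self, margin):
        """Return a copy enlarged by margin on every side."""
        return Box(self.x - margin, self.y - margin,
                   self.width + 2 * margin, self.height + 2 * margin)

    def overlaps(self, other):
        """True if the two boxes share any area (touching edges do not count)."""
        return (self.x < other.x + other.width and other.x < self.x + self.width
                and self.y < other.y + other.height and other.y < self.y + self.height)


def intrudes_on_quiet_zone(barcode, quiet_zone, other):
    """True if other overlaps the barcode or the blank margin reserved around it."""
    return barcode.grown(quiet_zone).overlaps(other)


barcode = Box(100, 100, 40, 10)
too_close = Box(143, 100, 20, 10)   # clear of the bars, but inside a 5-unit margin
far_enough = Box(150, 100, 20, 10)
```

With a quiet zone of 5 units, `too_close` is flagged even though it does not touch the barcode itself, while `far_enough` is not.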
Components within a Compound Object
Compound objects should be considered unique entities. Each should contain either all of the resource objects and data to be placed, or it should reference hard resource objects. A compound object should not reference resource objects found in other compound objects within the presentation file; e.g. Table 1 in Figure 4 should not reference a font object that was carried within the address block object. These objects may be replaced over time with unintended consequences.
A compound object sets its beginning position either absolutely, from the beginning of the page, or relative to another object (usually the previous object’s end position). It will also specify its orientation relative to the orientation of the page. For example, the compound object carrying the Barcode in Figure 3 is set at a 90° orientation to the page. All other compound objects are set at 0°.
A compound object positions information (Data) within the object. This can be absolute, as in Line 1 in Figure 4, or it can be relative, as in Line 2.
A compound object can reference a Font or Color Palette, either to be used within the entire object or by specific text placement. It can also carry these resources for its exclusive use.
Graphic, Barcode or Image objects can either be referenced or carried.
Types of Compound Objects
Adobe Portable Document format (PDF)
Although we think of PDF as a Presentation Stream or Format, a PDF object can be carried within a page or document of a larger PDF file or an AFPDS file.
One of the strengths of PDF is the ability to containerize resource objects and information, and to isolate hard resource objects at the beginning of the presentation environment (file). PDF actually carries most of its hard resources between page 1 and 2 of the file. It only carries the resources necessary to build page 1 at the beginning. This allows Acrobat to quickly render page 1 for the recipient, giving a respite to load the rest of the resources before the recipient views page 2.
Color Objects
Understanding Color
Color is the most difficult object-type to replicate, adding significant time and resources to both implementation testing and ongoing production.
Psychology of Color
Color emotionally affects us. Some colors, like red, make us more attentive. Others, like blue, make us more imaginative. Warm colors can make us energetic or happy, while cool colors can make us relaxed. The lack of color can cause depression, like Seasonal Affective Disorder, that is treated with the use of bright light.
Microsoft proved that you can enhance or inhibit curiosity through the amount of red within the blue used on web links. They drove over $80 million in increased click-through revenue on their Bing search engine by shifting from reddish blue to greenish blue.
Pharmaceutical companies can demonstrate statistical differences based on the color of the pill alone. Although white is most commonly used (because we consider it pure or clean), warm colors are sometimes used to inspire hope. Light blue is sometimes used to signify calmness.
Corporate Identity
Organizations use one or two specific colors as part of their corporate identity, usually defined in a Corporate Style Guide. Much thought is put into the specific color combinations. What psychological message does it provide? Consider Starbucks green, intended to encourage customers to ‘take a calming break’. For some logos the color has historical connections, such as Coca-Cola red. Some logos do both, such as the IBM eight-bar blue on white logo, evolved from the Trusted IBM Salesman stereotype: blue suit, white shirt, and striped tie.
Corporate identity colors are used for more than the logo. They are also used to inform a variety of document elements, including:
- headings,
- call-out text,
- lines or boxes in tables,
- business graphs,
- line-drawings, and
- photographs, either as the:
  - shade color in a monochrome picture, or
  - dominant color in a full color picture.
In situations where you see only one color used on a white background, the secondary color listed in the corporate style guide is often white. The guide may also allow the use of shades and tints of the corporate identity color; the color hue doesn’t change but the luminance does.
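The relationship between a corporate color and its tints and shades can be sketched in a few lines. This is a minimal illustration, assuming the color is held as 8-bit RGB and using Python's standard `colorsys` module; `adjust_luminance` and `BRAND_GREEN` are hypothetical names, and the RGB value is an invented example rather than any real style guide color.

```python
import colorsys


def adjust_luminance(rgb, amount):
    """Return a tint (amount > 0) or shade (amount < 0) of an 8-bit RGB color.

    amount runs from -1.0 (black) to 1.0 (white); hue and saturation are kept.
    """
    r, g, b = (channel / 255 for channel in rgb)
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    if amount >= 0:
        lightness += (1 - lightness) * amount  # move toward white (tint)
    else:
        lightness *= 1 + amount                # move toward black (shade)
    mixed = colorsys.hls_to_rgb(hue, lightness, saturation)
    return tuple(round(channel * 255) for channel in mixed)


BRAND_GREEN = (0, 112, 74)  # hypothetical corporate identity color
tint = adjust_luminance(BRAND_GREEN, 0.5)
shade = adjust_luminance(BRAND_GREEN, -0.5)
print("tint:", tint, "shade:", shade)
```

Because only the lightness term changes, the tint and shade keep the hue of the base color, apart from small differences caused by rounding back to 8-bit values.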
Using Color Objects
Moving from black on white to full color printing processes creates the need to be aware of the different points in the process where color objects may be built, stored, re-profiled and otherwise touched. Here are some touch points:
- Document object management system: Manages color palettes and profiles.
- Composition system: May need to define color palettes and profiles for print or display devices.
- Print stream utility: May need to convert color definitions for finished product.
- Print management system: Must understand color attributes before assigning a color print device.
- Archive system: Must be able to faithfully render colors as printed or displayed.
- Electronic presentation system: Must convert color profile for web presentation.
- Printer: Must convert RGB color profile to CMYK before rasterizing page.
- Inserter: May need to read color marks as part of manufacturing process, and could print colors onto envelopes.
The only steps that are color-blind are generally the data extract (ETL) routine and the postal delivery.
How Color Works
To appreciate the power of color, understand the science behind it.
Color Wavelengths
Color is electromagnetic radiation in the visible spectrum, with wavelengths of roughly 400 - 700 nm. This band sits between ultraviolet and infrared, within the much wider range that runs from x-rays to radio waves. It’s the tiny portion of radiation that we can actually capture with our eyes. Between 400 nanometers (violet) and 700 nanometers (red), the color wavelengths spread out into the spectrum of a rainbow.
Seeing Color
There is a matrix of photoreceptor cells called rods and cones in the back of our eyes. There are about 120 million rods and 6 to 7 million cones in each human eye.
The rods are one thousand times more sensitive to the intensity of light than the cones, but insensitive to color. They differentiate images as black, white and different shades of grey. They pick up luminance within color, such as black text on a white background or the shade / tint added to a color hue. They also become increasingly important in lower lighting.
Each of the cones contains one of three pigments sensitive to RED, GREEN or BLUE. Each pigment absorbs a particular wavelength of color. The short wavelength cones absorb blue light, the middle wavelength cones green light, and the long wavelength cones red light.
A practical application: when designing transaction documents for low-light environments, make sure that the text is absolute black and the paper is absolute white.
Colors of Transmission – Red, Green, and Blue
Red, Green, and Blue are the three pigments in the cones of the eye and also the three primary colors we use to transmit color. When all three colors are transmitted onto a white surface, we should see white where they intersect. These colors are usually referred to as channels, e.g. the red channel.
Where only two of the three color channels intersect, we see the three primary colors of absorption: Blue + Green = Cyan, Green + Red = Yellow, and Red + Blue = Magenta.
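The additive mixing described here can be checked with simple arithmetic. The sketch below assumes each color is an 8-bit (red, green, blue) tuple and that overlapping light adds channel by channel; `mix_light` is a hypothetical helper, not part of any publishing tool.

```python
def mix_light(*colors):
    """Additively mix 8-bit RGB colors, clamping each channel at 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*colors))


RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

cyan = mix_light(BLUE, GREEN)         # two channels overlap
yellow = mix_light(GREEN, RED)
magenta = mix_light(RED, BLUE)
white = mix_light(RED, GREEN, BLUE)   # all three channels overlap

print(cyan, yellow, magenta, white)
```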
So, why are bananas yellow?
Bananas look yellow because their surface absorbs light-waves between 400 and 500 nanometers, which is the blue spectrum. The blue is removed by a yellow filtering process from the yellow pigments in the banana skin. As they go brown, the surface begins to absorb red and green.
RGB Values
Specific hues are achieved by adjusting the intensity of the color transmission. This can be described in two ways:
- In most publishing applications a hue is described by assigning a value from 0 – 255 (a byte of information) for each of the color channels, e.g. Cyan is R 000, G 255, B 255.
- In a web application a hue is described using the hex representation for each color byte, e.g. Cyan is: 00FFFF. This is called a hex triplet.
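The two notations above describe the same three bytes, so converting between them is mechanical. A minimal sketch, with `rgb_to_hex` and `hex_to_rgb` as hypothetical helper names:

```python
def rgb_to_hex(red, green, blue):
    """Format three 0-255 channel values as a hex triplet such as '00FFFF'."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError(f"channel value out of range: {value}")
    return f"{red:02X}{green:02X}{blue:02X}"


def hex_to_rgb(triplet):
    """Parse a hex triplet (with or without a leading '#') into three integers."""
    triplet = triplet.lstrip("#")
    if len(triplet) != 6:
        raise ValueError(f"expected six hex digits, got: {triplet!r}")
    return tuple(int(triplet[i:i + 2], 16) for i in (0, 2, 4))


print(rgb_to_hex(0, 255, 255))   # Cyan as a hex triplet
print(hex_to_rgb("00FFFF"))      # and back to channel values
```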
The Fourth Color: Alpha (white)[39]
The Alpha Channel is the 8-bit layer in a graphic file format that describes translucency. The additional eight bits per pixel serve as a mask and represent 256 translucency levels from entirely clear (0) to opaque (255), with levels in between representing the degree of haziness. RGBA uses one byte for each of the four channels, e.g. R 000, G 255, B 255, A 127. Graphics in TIFF, PNG and TARGA formats support the 8-bit alpha channel, while GIF formats support a 1-bit channel to indicate transparency/no transparency. JPEG formats do not support transparency.
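One common way an alpha value is applied is to blend the pixel over whatever lies behind it in proportion to its opacity. The sketch below assumes non-premultiplied 8-bit RGBA and an opaque background; `composite_over` is a hypothetical helper, and real renderers may handle alpha differently.

```python
def composite_over(foreground_rgba, background_rgb):
    """Blend an RGBA pixel over an opaque RGB background pixel."""
    *foreground_rgb, alpha = foreground_rgba
    opacity = alpha / 255  # 0.0 is entirely clear, 1.0 is opaque
    return tuple(
        round(front * opacity + back * (1 - opacity))
        for front, back in zip(foreground_rgb, background_rgb)
    )


WHITE_PAPER = (255, 255, 255)
half_clear_cyan = (0, 255, 255, 127)  # the RGBA example from the text

print(composite_over(half_clear_cyan, WHITE_PAPER))
```

At alpha 255 the result is the foreground color unchanged, and at alpha 0 only the background shows through.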
Colors of Absorption
Although we use light to transmit color, we use pigments to filter out light and reflect the remaining color. On printed documents, ink and toner reflect color because they contain pigments that absorb transmitted colors. This is known as absorption.
Although any color pigment will absorb light, we normally use permutations of three colors to create a wide range of hues: Cyan, Magenta, and Yellow. They are known as the primary colors of absorption.
The Fourth Color: Black
Black is used to both shade hues (decreasing their luminance) and to provide an absolute black. The letter K is used to represent black, which is a legacy from the early days of printing when the black plate was used as the key for alignment.
Most of the readable information is printed in black. Using the black color channel to produce the black text will provide the best transition from black to white, making the document easier to read.
When setting the CMYK value for black text, ensure that it is “0, 0, 0, 100”.
Theoretically, you can mix cyan, magenta, and yellow to get black. In the real world, the mix does not fully absorb light and you end up with a brownish hue.
CMYK Color Values
Unlike RGB, we describe the amount of color applied in each channel as a coverage percentage. For example, Green would be described as C 100%, M 0%, Y 100%, K 0%, or “100, 0, 100, 0”.
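The coverage percentages can be related to RGB values with a simple textbook formula. This is only a naive, device-independent sketch: it ignores ink, paper and ICC profiles, so a production workflow would not rely on it. `rgb_to_cmyk` is a hypothetical helper name.

```python
def rgb_to_cmyk(red, green, blue):
    """Naive conversion of 8-bit RGB to CMYK coverage percentages."""
    r, g, b = red / 255, green / 255, blue / 255
    key = 1 - max(r, g, b)       # black replaces the shared darkness
    if key == 1:                 # pure black: use the K channel only
        return (0, 0, 0, 100)
    cyan = (1 - r - key) / (1 - key)
    magenta = (1 - g - key) / (1 - key)
    yellow = (1 - b - key) / (1 - key)
    return tuple(round(value * 100) for value in (cyan, magenta, yellow, key))


print(rgb_to_cmyk(0, 255, 0))   # Green
print(rgb_to_cmyk(0, 0, 0))     # Black text
```

Under this formula pure black maps to the K channel alone, which is consistent with the advice to set black text to “0, 0, 0, 100”.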
Spot Color
Pantone
Pantone Inc. provides a process for standardized spot colors of absorption. Pantone colors can be specified to a commercial printer, an ink manufacturer, a paint company, or a textile manufacturer and what is provided should arrive with consistent color.
Pantone has introduced two separate color standards over its history.
Pantone Matching System
The Pantone Matching System (PMS) was introduced in the 1960s and uses 13 base pigments (plus 2 for white and black) to produce 1,114 spot colors.
It can be difficult to match PMS colors identified in corporate style guides. They are approximated with the CMYK printing process as the 15 different pigments do not directly map to CMYK. In addition, the exact formulation is Pantone proprietary information and can change with each type of target substrate.
Spot colors are optimized for solid areas of color. They’re brighter and crisper than combined colors; it’s difficult to duplicate them with CMYK printing.
Most PMS colors cannot be matched in RGB, forcing most style guides to list an RGB approximation for web presentation and electronic documents. We can use the RGB value when profiling CMYK printing.
Pantone Plus
PMS was updated in 2010 with the introduction of Pantone Plus. It provides an additional 560 colors that are based on the same 13 + 2 pigment formulation.
Pantone Goe System
The Pantone Goe System, introduced in 2007, uses only 10 pigments plus clear coating for reflection. The new system was developed to be more compatible with color management software. Pantone has not matched the older PMS to the new Goe colors, stating that they do not always match. They advise organizations to continue to use the original PMS number for existing color specifications (such as a corporate logo) and use the Goe system numbers when picking a new color.
ICC Color Profiles[40] – Bridging between RGB and CMYK
We create color objects in an RGB environment but need to produce them in CMYK. The RGB devices that we use are constantly improving in terms of the range of colors, known as gamut, that they can display. Each display technology has significant differences in color channel transmission. Displays also degrade over time because of burn-in and other factors.
CMYK process color printing also has idiosyncrasies. Inkjet, electrophotography, and lithographic processes have different generic gamuts. Each supplier has their own tweaks to the process. In addition, the paper used will change the results based on color, opacity and gloss.
Although manufacturers provide ICC Color Profiles for their devices, many organizations use spectrometers to adjust the profile on a specific monitor or for a printer / paper stock combination.
The International Color Consortium (ICC) has developed a standard[41] for mapping a (usually RGB) device’s color space using the CIE XYZ color gamut[42] as the connecting color space, and then using it to create a target profile (usually CMYK).
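The first half of that mapping, from a device RGB value to CIE XYZ, can be illustrated for one well-known case. The sketch below assumes the source is the standard sRGB color space with a D65 white point and uses the published sRGB transfer curve and matrix. It is not an ICC profile implementation: real profiles carry their own curves, matrices or lookup tables, and the ICC connection space is defined relative to D50, which requires a further adaptation step not shown here. The function names are hypothetical.

```python
def srgb_to_linear(channel):
    """Undo the sRGB transfer curve for one 8-bit channel value."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def srgb_to_xyz(red, green, blue):
    """Map an 8-bit sRGB color to CIE XYZ (D65 white, Y scaled to 1.0)."""
    r, g, b = (srgb_to_linear(c) for c in (red, green, blue))
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    return (x, y, z)


print(srgb_to_xyz(255, 255, 255))  # the sRGB white point
print(srgb_to_xyz(0, 255, 255))    # Cyan
```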
Compensating for Color Blindness
Approximately 7% of Caucasian males are color blind, compared with roughly 1 in 200 females. Color blindness is a genetic abnormality in the cones of the eye.
When planning to use color to highlight or direct the reader, add additional clues:
- avoid mixing pale green and red,
- have at least 60% contrast between text and background,
- graphics:
  - use different line weights and line types (solid, dashes, dots),
  - use patterns in color areas on bar or pie charts, and
  - attach labels with leader lines.
- text:
  - italics,
  - bold, and
  - different point size.
The best way to see if a color-blind person can understand the color information is to make a black on white photocopy. If you can still understand it, the clues should be understood by someone who is color blind.
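The photocopy test can be approximated before anything is printed by converting each color to a grey level and comparing the results. This sketch assumes the common Rec. 601 luma weights (0.299, 0.587, 0.114) as a stand-in for a photocopier, and it assumes that "contrast" means the difference in grey level as a percentage of the full black-to-white range; the text does not define how the 60% figure is measured. Both helper names are hypothetical.

```python
def to_gray(rgb):
    """Approximate the grey level (0-255) of an 8-bit RGB color."""
    red, green, blue = rgb
    return round(0.299 * red + 0.587 * green + 0.114 * blue)


def gray_contrast_percent(color_a, color_b):
    """Grey-level difference as a percentage of the black-to-white range."""
    return abs(to_gray(color_a) - to_gray(color_b)) / 255 * 100


BLACK, WHITE = (0, 0, 0), (255, 255, 255)
RED, GREEN = (255, 0, 0), (0, 255, 0)

print(gray_contrast_percent(BLACK, WHITE))  # text on paper
print(gray_contrast_percent(RED, GREEN))    # two highlight colors
```

Two colors that look very different on screen can land on similar grey levels, which is why the additional clues listed above matter.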
AFP Color Management Object Content Architecture (CMOCA)
“A resource architecture used to carry the color management information required to render presentation data… It defines objects that provide color management in presentation environments. These objects are called Color Management Resources (CMRs).[43]”
The CMR is embedded within an object container, describing its color characteristics. These containers include the print file, individual documents, one or more pages, and specific objects such as images, graphics, fonts and barcodes.
CMOCA supports TIFF and JPEG image objects.
References
- ↑ ISO/IEC 9541 defines a method of naming glyphs and glyph collections, independent of any document encoding technique; it assumes that one or more methods of associating document encoding techniques with glyph identifiers used in font resources will be provided by text processing systems. ISO/IEC 9541-1:2012 specifies the architecture of a font resource, i.e. the font description, font metrics, glyph description and glyph metrics properties required for font references and the interchange of font resources.
- ↑ Typographic Design: Form and Communication 2nd ed. John Wiley & Sons: 1993
- ↑ Typography: how to make it most legible – 5th Edition, pp 16- 18, Rolf F. Rehe, Design Research International, 1984
- ↑ The most popular fonts used by designers, Web Designer Depot, Cameron Chapman, 2011. http://www.webdesignerdepot.com/2011/08/the-most-popular-fonts-used-by-designers/
- ↑ Ibid.
- ↑ Font designers usually copyright their font designs, trademark the name, and then license document producers to use the fonts under specific conditions. For example, the Frutiger font was designed by Adrian Frutiger under commission to the Charles de Gaulle Airport. The font is distributed primarily by Linotype (now owned by the parent of Monotype). The licensing agreements for Frutiger (and other Monotype licensed fonts) can be found at www.fonts.com.
- ↑ What is OpenType: http://www.microsoft.com/typography/WhatIsOpenType.mspx
- ↑ ISO/IEC 14496-22 (MPEG-4 Part 22)
- ↑ http://www.w3.org/TR/2012/REC-WOFF-20121213/
- ↑ True Type Specification 1.66 http://www.microsoft.com/typography/SpecificationsOverview.mspx
- ↑ Font Object Content Architecture Reference, 2005 http://afpcinc.org/site/assets/files/1129/ibm_foca_focaref5.pdf
- ↑ http://www.microsoft.com/typography/fonts/
- ↑ W3C Scalable Vector Graphics (SVG) 1.1 Specification http://www.w3.org/TR/SVG/ . Version 1 was adopted in 2001 and Version 1.1 in 2003. According to the W3C SVG Roadmap, SVG 2 is scheduled for recommendation in August 2014 and will better integrate with CSS, HTML5 and WOFF.
- ↑ Graphics Object Content Architecture for Advanced Function Presentation Reference, January 2012 http://afpcinc.org/site/assets/files/1130/goca2ref.pdf
- ↑ Absolute white paper should measure 154 on the CIE Color Gamut
- ↑ Most xerographic paper has ‘blue’ fluorescent whitener added to make it ‘brighter’. As a result, ‘brightened’ paper can actually measure above 100% reflectivity due to the introduction of fluorescent blue dye.
- ↑ ASCII x5F and also called the ‘low line’; not the ‘em-dash’ x97 nor the ‘macron’ xAF
- ↑ ASCII x7C
- ↑ Introduced in 1974 as a font.
- ↑ ANSI MH 10.8 M-1983
- ↑ ANSI/AIM BC1/1995, Uniform Symbology Specification - Code 39
- ↑ GS1-128 barcode specification http://www.gs1.org/docs/gsmp/barcodes/GS1_General_Specifications.pdf
- ↑ note: and the receipt presented is an actual ‘transactional document’.
- ↑ Now known as GS-1 US http://www.gs1us.org/
- ↑ US Federal REAL ID Act of 2005. http://www.gpo.gov/fdsys/pkg/PLAW-109publ13/html/PLAW109publ13.htm
- ↑ ISO standard 15438
- ↑ ISO/IEC 16022:2006—Data Matrix bar code symbology specification ISO/IEC 15415—2-D Print Quality Standard
- ↑ Can only be rendered on Xerox RIPs
- ↑ http://www.microglyphs.com/english/html/company.shtml
- ↑ The document can be stored as a multi-page TIFF. Each page is a separate raster image.
- ↑ For additional TIFF material consult: http://partners.adobe.com/public/developer/tiff/index.html#spec
- ↑ A half-byte describes 16 and a full-byte describes 256 shades.
- ↑ TIFF Revision G Final http://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf
- ↑ The latest section is ISO/IEC 10918-6:2013 -- Digital compression and coding of continuous-tone still images: Application to printing systems.
- ↑ OLED display technology uses four LED colors for each pixel, including pure white. This allows it to faithfully replicate the Alpha channel in RGBA photography (and video).
- ↑ Image Object Content Architecture Reference - Release 6.0 - S550-1142-00, page 7. http://afpcinc.org/+site/assets/files/1128/ibm_ioca_55011420.pdf
- ↑ Functional Set 45 describes bilevel or color tiled images
- ↑ This means that the RIP doesn’t have to do a complex ICC conversion each time it is given a color attribute.
- ↑ http://www.pcmag.com/encyclopedia/term/37669/alpha-channel
- ↑ At the time of printing, the current version is ICC.1:2004-10 (Version 4.2.0.0)
- ↑ Draft International Standard ISO 15076-1:2005.
- ↑ Introduced by the International Commission on Illumination (CIE) in 1931.
- ↑ Color Management Object Content Architecture Reference AFPC-0006-01. http://www.afpcinc.org/site/+assets/files/1068/cmoca_reference-01.pdf