Editing tables of data

by Irene Wong

“There is a real gulf between the statistical and non-statistical worlds of ideas, and the statistician often finds it difficult to project his ideas across that gulf. This is perhaps partly his own fault in that his jargon, like all scientific terminology, tends to intensify the difficulties.” W J Reichmann (1961)

Although written in 1961, this is still true (except that we now have female statisticians also perpetuating this fault!). Editors can fill the gulf.

Tables should allow readers to easily and accurately:

  • see what subject matter and variables are being described
  • find out absolute values
  • observe relationships between variables

When you edit a table, it is useful to assess just how well it achieves these ends. Readers will feel confident with your table if they can quickly navigate around and absorb the data.

Carolyn Rude (1991) has suggested that tables should be edited within “the familiar copyediting realms” for:

  • correctness
  • consistency
  • accuracy and completeness
  • visual readability

The distinction between these edits is blurred and you may, for example, look at accuracy and consistency at the same time. How you edit will depend on who your readers are, the subject matter, the size of the job and time available.

It is impossible to draw up a general checklist for editing all tables. However, Rude’s general principles provide a useful framework around which to edit most tables.


Editing tables is much like editing text, but you need to be aware of the many problems specific to the readability and retrievability of data in tables.


The editing you do will also depend on the type of table because a statistical table used for further analysis has a different aim from tables of information in which you look up a single fact. Even within statistical tables, some may relate to a certain argument and are “demonstration tables” (Chapman and Mahon, 1986). Others may present a wide array of information from which readers draw their own conclusions.

Click here for a diagram of the parts of a typical statistical table.

The eventual format of the data (for example, hard copy, on-line or CD-ROM) will also be a major influence on your approach as will the origins of the data.

You will find it better to edit a set of tables and accompanying text in one sitting. Your memory and ability to bring together apparently unconnected facts will not be strained. In one session it is easier to gather all the threads that you want to compare.

Observe tables you find in the course of your daily life and note how you react to them. Were you able, for example, to quickly assimilate the information that was presented in tabular form on the election night telecast? Consider why this was your reaction.

Subject matter knowledge

The argument about the need for editors to have a knowledge of the subject is often raised. However, there is something quite universal about a table.

Technical experts become excited, in fact delighted, with their statistics, the computer program which generated them, or even the fact that their manager approved of the work! This enthusiasm needs to be directed towards readers and not the statistics or program as an end in itself. As editor you have the responsibility to channel their euphoria into an informative and navigable piece of information.

On the other hand, they might be glad to see the end of the statistics because they have been through a long process to collate them. They might be just about to move on to another collection and they have forgotten that the whole purpose of compiling the data was for it to be published. They might even be bored with the data by the time you are asked to edit their tables.

Editing tables is like editing text. You may have to work hard at cajoling authors and technical experts into assisting you.

Depending on the content, it is generally unnecessary to be a mathematician or statistician to edit tables. Confidence with basic arithmetic is a great help. Tables contain patterns which editors should quickly observe and use to their own advantage.

It is better to be involved in the early stages of planning the statistics, so you know how the data were collected and compiled. This background is useful because you will be aware of possible problems, shortcomings and features of the final data. It’s also good to be involved early in the planning of the final output and preparing a profile of your readers and their needs.


Edit a set of tables and accompanying text in one sitting, because it is easier to gather all the threads that you want to compare.


Correctness

Title
Does the title include ‘what’, ‘where’, ‘when’ and any other necessary identifiers?
Do not rely on readers knowing that the table refers to a particular geographical area because the whole of that chapter relates to that region. Readers may just glance at the table and ignore the fact that it is printed in the chapter on NSW.
All titles in all tables should be written and formatted consistently.

Units of collection
If the units of measure are not in the title, do they appear at the top of columns?
Are the figures meaningful to the reader? Might it be better to explain that a wizzet is about the diameter of a human hair so that readers can appreciate how small a wizzet is?
Consider how useful the unit of measure is. For example, where data are collected in kilograms consider whether readers need to know the total was 6,789,123 kilograms. A rounded figure of 6,800 tonnes may suffice. Consider what unit the industry deals in. For example, gold is measured in ounces. However, you could publish in ‘000 oz which would be easier to absorb.
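As a rough illustration of the kilograms example above, the conversion and rounding can be sketched in a few lines of Python. The function name and the choice to round to the nearest 100 tonnes are assumptions made for this sketch, not a rule from any style guide.

```python
KG_PER_TONNE = 1000

def kg_to_rounded_tonnes(kilograms, nearest=100):
    """Convert kilograms to tonnes, rounded to the nearest `nearest` tonnes."""
    tonnes = kilograms / KG_PER_TONNE
    return round(tonnes / nearest) * nearest

collected_kg = 6_789_123  # the figure quoted in the text
published = kg_to_rounded_tonnes(collected_kg)
print(f"{published:,} tonnes")  # prints "6,800 tonnes"
```

How coarsely you round depends on your readers; the `nearest` argument is there so you can try a finer or coarser figure and judge which is easier to absorb.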

Rounding
If you are using estimated figures, it isn’t realistic to publish data to 4 decimal places.
Rounded figures should not be used for calculations such as totals or proportions. For example, if you add up 10 figures which have all been rounded up, your total will be different from the result obtained from adding the unrounded figures.
If data have been rounded, the publication should state that differences may occur between sums of component items and totals because of rounding.
Remember that rounded figures are easier to use when you do mental arithmetic. It is easier to subtract 56 from 89 than 56.345 from 88.987.
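To see why sums of rounded components can disagree with a published total, here is a minimal sketch in Python. The four component figures are invented for illustration, and the sketch assumes whole-number rounding with halves rounded up.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical unrounded component figures (illustrative only).
components = [Decimal("12.4"), Decimal("7.4"), Decimal("3.4"), Decimal("9.4")]

def round_whole(value):
    """Round to the nearest whole number, with halves rounded up."""
    return value.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Total calculated from the unrounded figures, then rounded for publication.
published_total = round_whole(sum(components))

# Total a reader gets by adding the rounded figures shown in the table.
sum_of_rounded = sum(round_whole(c) for c in components)

discrepancy = published_total - sum_of_rounded
print(published_total, sum_of_rounded, discrepancy)  # prints "33 31 2"
```

Both totals are "right" in their own terms, which is exactly why the publication needs a note about rounding.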

Blank cells
Are any cells left blank?
If yes, identify the reason; for example, the information is confidential, is zero or rounded to zero, is not available yet or is not applicable (such as zero pregnancies for males). No cell should remain blank. Footnotes or standard notations should be inserted into all cells which contain no data or zero.
If there are many empty cells, perhaps the whole table needs redesigning or broader levels of disclosure should be adopted. It depends on what you are trying to tell your reader. There are many instances where a nil value is a very useful piece of information.
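If the table is available electronically, a short script can list the blank cells for you to query. The sketch below assumes the table is held as a list of rows of text, with column headings in the first row and the stub in the first column; the sample data are invented.

```python
# Hypothetical table: first row holds column headings, first column holds stubs.
table = [
    ["Region", "Males", "Females"],
    ["North", "120", "134"],
    ["South", "", "98"],
    ["East", "87", "   "],
]

def find_blank_cells(rows):
    """Return (stub, column heading) pairs for cells that are empty or only spaces."""
    headings = rows[0]
    blanks = []
    for row in rows[1:]:
        for position, cell in enumerate(row):
            if cell.strip() == "":
                blanks.append((row[0], headings[position]))
    return blanks

for stub, heading in find_blank_cells(table):
    print(f"Blank cell: {stub} / {heading}")
```

The script only finds the gaps. Deciding whether each one is confidential, nil, not yet available or not applicable is still a question for the author.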

Footnotes
Check that all footnotes are included and are numbered.

Revisions, preliminary data, estimates
Do you want to highlight revised, preliminary or estimated data? Establish a style for doing this.

Source
If you need to identify the source of the information, perhaps for copyright reasons, check that acknowledgements are included and are in a consistent format.

Index numbers
Carefully consider how information drawn from index numbers is presented.
Press reports about the Consumer Price Index (CPI) give percentage increases over a quarter or year. Only in a more detailed commentary is there any attempt to give actual index numbers. Some readers will not realise that the CPI is published as an index number, such as 197.6 for a quarter, which indicates the increase since the index was started in the base year when it was 100.
If the table does present index numbers, check that the base for the data is included.
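The relationship between an index number and the percentage changes quoted in press reports can be sketched as follows. The index of 197.6 and the base of 100 come from the example above; the previous quarter's index of 195.2 is a made-up figure used only to show the calculation.

```python
def percentage_change(earlier, later):
    """Percentage change from an earlier figure to a later one."""
    return (later - earlier) / earlier * 100

BASE_INDEX = 100.0       # value of the index in the base year
current_index = 197.6    # figure quoted in the text
previous_index = 195.2   # hypothetical previous quarter, for illustration only

since_base = percentage_change(BASE_INDEX, current_index)
over_quarter = percentage_change(previous_index, current_index)

print(f"Increase since base year: {since_base:.1f}%")   # 97.6%
print(f"Increase over the quarter: {over_quarter:.1f}%")  # 1.2%
```

Neither percentage can be worked out unless the table states the base, which is why that check matters.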

Totals
Check that there are totals where they are needed. Think about whether you should include subtotals.

Percentages, ratios, averages etc
Consider readers’ needs in terms of percentages or ratios rather than absolute values or totals. Don’t expect them to have to calculate these values if there are standard ratios or calculations in that subject. For example, in a table showing population by local government area (LGA) readers may also want to know population densities. The area of all LGAs should be provided (this is sometimes called the ‘denominator’) or, better still, include the density in the table.
Another reason for inserting percentages is that readers unfamiliar with the subject matter will be guided in their conclusions and observations. The CPI is an example of where percentage change, rather than an index number, is more informative for many readers.
You may have to make more complicated calculations to give your reader the information they really need. For example, it is often more informative to look at value added (that is, turnover plus increase, or less decrease, in the value of stocks, less purchases, transfers in, and selected expenses) per employee. Value added gives a better indication of production than turnover does. (A company with a turnover of $100 million may not really be adding as much value to materials as a company with a turnover of $20 million.) However, since the calculation of value added involves addition and subtraction, the value should be prepared by the author and inserted.
Averages are often more useful for comparing data in a long time series. For weather or agricultural statistics, averages over time are more meaningful.
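The value added calculation described above can be sketched as below. The two companies and all of their figures are invented for illustration (amounts in $ million), and purchases, transfers in and selected expenses are lumped into a single deduction to keep the sketch short.

```python
def value_added(turnover, change_in_stocks, purchases_and_expenses):
    """Turnover plus increase (or less decrease) in stocks, less purchases etc."""
    return turnover + change_in_stocks - purchases_and_expenses

# Hypothetical companies; a negative stock_change means stocks decreased.
companies = {
    "Company A": {"turnover": 100.0, "stock_change": -2.0, "purchases": 90.0, "employees": 400},
    "Company B": {"turnover": 20.0, "stock_change": 1.0, "purchases": 6.0, "employees": 50},
}

results = {}
for name, c in companies.items():
    added = value_added(c["turnover"], c["stock_change"], c["purchases"])
    results[name] = (added, added / c["employees"])
    print(f"{name}: value added ${added:.1f}m, ${added / c['employees']:.2f}m per employee")
```

In this made-up case the company with one fifth of the turnover adds nearly twice the value, which is the kind of comparison readers cannot easily make from turnover alone.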

Mental arithmetic
If the differences between two rows have been calculated, consider whether it is useful to put + (plus) as well as – (minus) before the results.
If readers need to subtract or compare rows, try to place these rows next to each other. At school we were taught with the larger figure on top and shown how to subtract the bottom figure. This is the best way to present data that readers might have to perform mental arithmetic on.
Figures in a column are generally closer together than adjoining figures in a row. Therefore it is easier to make calculations with figures above or below each other than with those side by side. If figures which may be used in a calculation are far apart in a column, consider reducing the leading to bring them closer together.
Be careful with horizontal and vertical lines. They tend to isolate figures from each other.

Parentheses, brackets and braces
Are these included? They may be around negative values or grouped entries.

Placement of the tables
Check that the tables are in the right place. If they are supposed to be next to particular text, check that they are. If they are together, are they in a suitable sequence? Consider their placement opposite each other on a double page spread.

Cross references
Check that all cross references are correct.

Size
Is the table a suitable size? For example, would it be better to split it into three smaller tables or should some tables be combined? Small tables are often easier to position in between text, just where the information is described.
If you are trying to present a particular argument, smaller tables will always work better. However, if you are providing reference information, a large table will be easier for readers to navigate and to discover relationships in.
A table which runs over two pages will generally be better split into two or more tables. However, some tables will run over many pages and you should check that the page breaks are in a suitable place and that the footnotes are easily accessible even to the laziest reader; that is, on the same page as the information. All pages should have the title and column headings reprinted on them.

Shape
Generally you should try to avoid landscape tables (those turned sideways on the page). In a publication where all the pages are in portrait orientation, the sudden appearance of a landscape page means the book has to be rotated. This is undesirable.
A landscape table could be split into separate tables or could be spread across two portrait pages. If you do this consider repeating the stubs down the right hand side of the second page. There are problems with this approach though. Firstly, you need to be quite certain that rows will perfectly align after printing and binding. Secondly, you restrict your paging options if you are forced to create two-page spreads.
Another solution to a poorly shaped table is to consider whether the stubs and headings should be reversed. However, to maintain consistency between tables this may not be an option. If you have a long narrow table which you would prefer to fit across the page, design it as follows:

stub   data       stub   data
stub   data       stub   data
stub   data       stub   data
stub   data       stub   data

If you put the flowing stubs in italics, your reader will realise there is something different happening here. You could put a light vertical line to the left of the second column of stubs.

Unnecessary information
Consider if the table is necessary at all. Perhaps all the information is in the text or has been presented graphically. Also check that the reverse is not true. Just because the data have been graphed, it doesn’t mean that readers do not want the exact data to use in another situation.
Should some information be deleted? Do you really want information about males and females or would data about total persons be sufficient?

White space
If there are many rows of data which are not broken up with white spaces, then you should insert spaces. Put a space after every five or six rows. The table will look less oppressive and it will be easier to read the information correctly across the page without falling onto the wrong row half way along.

Headings and subheadings
Ensure that readers can quickly see where headings and subheadings are in the stubs. Make use of indents, italics and bold to help your readers find what they want.
When you checkadd the table, you will identify the subheads. Are there more than two levels in the headings and stubs? The maximum which can be clearly presented is usually three. Perhaps the stubs and headings should be reversed or the table totally redesigned or split up.
Three levels of heading in stubs can be avoided by inserting wafer headings.

Dump items
Is there a stub or column called ‘other’? If this item is large in terms of the total, you should query whether it should be broken up into more meaningful information.
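One way to judge whether an ‘other’ item is large is to express it as a share of the total. The sketch below uses invented figures, and the 20 per cent threshold is purely an assumption for illustration; what counts as "large" is an editorial judgement.

```python
def share_of_total(rows, label="Other"):
    """Return the named row as a percentage of the sum of all rows."""
    total = sum(rows.values())
    return rows[label] / total * 100

# Hypothetical stub items and values.
rows = {"Wheat": 420, "Barley": 180, "Oats": 60, "Other": 340}

QUERY_THRESHOLD = 20  # per cent; an arbitrary cut-off chosen for this sketch

other_share = share_of_total(rows)
needs_query = other_share > QUERY_THRESHOLD
print(f"'Other' is {other_share:.0f}% of the total; query author: {needs_query}")
```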

Conversion: imperial/metric, currencies
Be particularly careful with imperial/metric figures and overseas units of currency. Has the author been confused? Insert a conversion chart or convert the data before publication.

Balance
Does the table look balanced or do the title, headings and footnotes dominate the data which is trying to peek through somewhere in between? In such instances readers will probably find the data, but will ignore the footnotes and will only slowly absorb what is in the title. Tufte (1985) compares this situation to a local council building which has a grand portico at the entrance but where all the work is done in a tiny back room.

Time period
If the data relate to a specific period, is an appropriate number of time periods shown so that trends can be compared? For monthly data, show 15 months; for quarterly, show 6 quarters. These series provide an effective comparison of movements and levels between current periods and the corresponding month or quarter of the previous year.

Standard error, seasonal adjustment, sampling error
Should these be provided? What explanation of them is required? Have data with a high sampling variability been qualified?


Observe your reactions to tables you find in your daily life. Are you able to quickly assimilate the information in them? Why or why not?


Consistency

Terminology
Check that the same words are used to describe the same items in all tables and in accompanying text. For example, use ‘male’ or ‘female’ each time and change ‘men’ and ‘women’ if they are used. Check that in headings and stubs the items are, for example, all cities or all countries, all common names or all scientific names and not a mixture. If the data have been collected from different sources that use different definitions and terminology, consistency is harder to achieve. In this situation include some text to explain the sources and, particularly, information on whether the data are comparable.

Typefaces
Check consistency in the use of upper and lower case, alignment, typeface, lines, abbreviations, spacing and the use of the comma for thousands in numbers. Check that the time period is always described in the same way. For example, use ‘at 31 March’ or ‘at the end of March’, not a mixture of both.

Footnotes
Check that all tables use the same approach with footnotes. They should all use the same notation; that is, (a), (b) … or (i) (ii) … or symbols (in the correct order). If numbers have been used in mathematical works, ensure that their use will not be confused with superior numbers or indices.
Where possible ensure that each table has its own footnotes, rather than referring readers back to text or another table. They probably won’t bother to find it. If something is so vital that it requires a footnote, then readers must be able to easily find and read this important information.

Identical data
One of the major checks for consistency in a table is that the same data appear in the table and in its accompanying text, charts and graphs. Perhaps the table was created and a summary of its content was written. However, the table may have been revised but the text wasn’t.
Sometimes data in one table are amended because of a late revision, but the same data in another table are not changed. Obviously tables generated from databases are less likely to contain this error, but be careful about making assumptions.
If you have already checkadded the tables, this check will be much easier because you will know where certain items are presented and you can quickly confirm that they are the same. For example, you may find that in one table the data relate to just NSW but in another it may be NSW and ACT. Ensure that there is a reason for this and that the figures contain what they purport to include.

Order in lists
Ensure that stubs or headings are presented in the same order, for example in geographical distribution, a standard classification order or alphabetically. You will also need to assess that the ordering is the most suitable for the reader. In some tables you may list stubs by size because this assists your reader to confirm trends, as well as unexpected results. The problem with listing by size is that if there is more than one table the order might differ for each variable. However, if the reader will appreciate a size order you need to seriously consider it.

Accuracy

Patterns
You should look for patterns. For example, notice how many digits are in each number and see if any look uncomfortable in the row or column. Scanning along rows and down columns for this type of delinquent entry may be all you can do to edit for accuracy.
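The digit-count check can be mimicked in code if you have the figures in electronic form. This is only a sketch: it assumes a column of whole numbers and treats the most common number of digits as "usual", which will not suit every table.

```python
from collections import Counter

def odd_length_entries(column):
    """Return entries whose digit count differs from the most common digit count."""
    lengths = [len(str(abs(value))) for value in column]
    usual = Counter(lengths).most_common(1)[0][0]
    return [value for value, n in zip(column, lengths) if n != usual]

# Hypothetical column of figures; 47810 has one digit too many for its neighbours.
column = [4512, 4630, 47810, 4498, 4702]
print(odd_length_entries(column))  # prints [47810]
```

A flagged entry is not necessarily wrong. It is simply the figure your eye would have stopped at, and the one to query first.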

Progressions of numbers
Look for progressions of numbers and repeated numbers (something typed twice perhaps). It is to be hoped that your proofreader will locate errors of this kind — but you are often typesetter, proofreader and editor.

Breaks in series
If one figure increases, when others in the row decrease, check. Look at changes in trends. One explanation for a sudden change in figures is that a new method of collection has been adopted. Perhaps there has been a new definition of the data item. In this situation you should indicate in some way that figures before and after the event are not comparable. The writer may, or may not, want to describe the actual reasons for the break.
A line across the column at the point of change may be sufficient with a simple explanation that there has been a break in series and figures before and after the break may not be (or are not) comparable. If you have noticed the break your readers certainly will, and they will be making decisions on the basis of the data.

Checkadding
There is one edit you can undertake to check for accuracy. Checkadd tables where at all possible. I realise the magnitude of this task, but I cannot stress too highly how valuable it can be. Tables from a database may not need checkadding, but be certain of this before you decide not to check them.
You should treat checkadding as an integral part of proofreading. It is a simple check on an author’s work and keyboarding accuracy.
When you have to discipline yourself to recalculate data, you quickly grasp what information each table is presenting and how tables relate to each other. This gives you an excellent overview of the topic you are editing. Checkadding tables may be a better first step than editing the accompanying text.
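For tables you can get in electronic form, checkadding can be sketched in a few lines. The table names, components and totals below are invented, and the `tolerance` argument is an assumption added so that small rounding differences are not reported as errors.

```python
def checkadd(components, stated_total, tolerance=0):
    """Recalculate a total and report whether it matches the stated figure."""
    calculated = sum(components)
    matches = abs(calculated - stated_total) <= tolerance
    return matches, calculated

# Hypothetical tables: (component figures, total as printed).
tables = {
    "Table 1": ([412, 388, 95], 895),
    "Table 2": ([1250, 640, 310], 2300),  # printed total does not match
}

failures = []
for name, (components, stated_total) in tables.items():
    matches, calculated = checkadd(components, stated_total)
    if not matches:
        failures.append(name)
        print(f"{name}: printed {stated_total}, components add to {calculated}")
```

A script like this finds the discrepancy but not its cause. You still need to ask whether the total or one of the components is the figure in error.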

Unit of measurement
Another simple accuracy check is to confirm that the stated unit of measurement really has been used. You may find that the information was collected in kilograms but is published in tonnes or was collected in dollars but published in $’000 or $m.
As you already know, if you are not a subject matter expert you need to at least master some concepts about the topic in front of you. It is really worth taking the time to become familiar with the units of collection used. I have generally found that Standards Australia publications on the international system of units are useful for researching this.
Keep a careful watch for instances where the number of zeros in the units of measurement and rounding have confused authors. They may have misunderstood that the unit is thousands of tonnes and have still entered tonnes.

Decimal point
Make sure the decimal point is in the correct place and that there are leading zeros, that is, 0.84 and not .84.
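Missing leading zeros are easy to overlook by eye but simple to search for. The sketch below assumes the cells are available as text and uses a regular expression to find a decimal point that has no digit in front of it; the sample cells are invented.

```python
import re

# A decimal point followed by a digit, but not preceded by a digit or another point.
NO_LEADING_ZERO = re.compile(r"(?<![\d.])\.\d")

def missing_leading_zero(cells):
    """Return the cells that contain a decimal written without a leading zero."""
    return [cell for cell in cells if NO_LEADING_ZERO.search(cell)]

cells = ["0.84", ".84", "12.30", "-.5", "7"]
print(missing_leading_zero(cells))  # prints ['.84', '-.5']
```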

Data
The hardest check will be on the data itself. Of course you may have an expert on hand. However, you may have to edit this aspect yourself. You will need to draw upon every bit of evidence, every clue, every source and as much common sense and intuition as you can.
For example, it is not difficult to edit a table on monthly chocolate production if you think about climatic and social factors which may be significant influences. Compare data for another time period, location or an associated variable. Look at the trend in the same month or quarter last year. For example, did chocolate production fall by such a large amount last February?

Time period
Always check that the stated time period is correct. Perhaps a table was used last year and the data were updated correctly. However, someone could have forgotten to update the date in the title. Consider whether the table relates to a financial year ended June (or another month for some companies or products), a calendar year or just a week, month or quarter. The data might relate to information on a certain day. The table could be average figures for a period or at a certain date. This should be clearly stated.

Negative values
Negative figures should be presented so that their value is clear. For example, they may be preceded by a minus sign or entered in brackets or in italics.
You will discover negative figures when you checkadd.

Confidentiality
Check that the organisation’s rules on confidentiality haven’t been broken. If rules don’t exist consider the implication if, for example, data about one customer or the total number of customers were published or were deducible. Ask whether this is acceptable.

“Greater than or equal to” or “less than or equal to”
Carefully check the use of the symbols greater (or less) than or equal to. These seem to create problems, especially when there is a series of them. The ‘equal to’ part of this symbol is often overlooked and certain values are not included in the equation.
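A quick way to see what goes wrong when the ‘equal to’ part is dropped is to test the boundary value against each category. The categories and the boundary of 50 below are invented for this sketch.

```python
# Hypothetical categories as they appear in a draft table: 50 itself fits neither.
draft_bands = {
    "less than 50": lambda x: x < 50,
    "greater than 50": lambda x: x > 50,
}

# The same categories with 'or equal to' restored on the upper band.
fixed_bands = {
    "less than 50": lambda x: x < 50,
    "greater than or equal to 50": lambda x: x >= 50,
}

def uncovered(values, bands):
    """Return the values that do not fall into any band."""
    return [v for v in values if not any(test(v) for test in bands.values())]

test_values = [49, 50, 51]
print(uncovered(test_values, draft_bands))  # prints [50]
print(uncovered(test_values, fixed_bands))  # prints []
```

When there is a series of bands, testing each boundary value in this way shows whether any value is left out or counted twice.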

Visual readability

Readability means checking that a table is easy to read and is visually pleasing. This is an important aspect of tables.

The presentation and typesetting of a table is a complete topic on its own and I feel that it would be better looked at from the point of view of designing and creating, rather than editing, a table.

An excellent book on preparing tables is Plain Figures.

References

Chapman, Myra and Mahon, Basil. Plain Figures. HMSO. London. 1986.

Reichmann, W J. Use and Abuse of Statistics. Methuen. Great Britain. 1961.

Rude, Carolyn D. Technical Editing. Wadsworth, Inc. USA. 1991.

Tufte, Edward. Envisioning Information. PA:ISI Press. Philadelphia. 1985.

Irene Wong is Senior Editor with the Australian Securities and Investments Commission. She designed and edited tables during 13 years with the Australian Bureau of Statistics.


Last updated 18 May 1999
