Advanced Data Entry

This section describes advanced methods for data entry in Gnumeric. This includes techniques useful when adding large amounts of data, methods to automatically catch mistakes during data entry, using pre-defined templates to format data input, obtaining data from external sources and generating sequences of random numbers with defined distributions.

5.4.1. Entering Large Quantities of Data

It is sometimes necessary to enter large amounts of data by hand into a spreadsheet. Gnumeric provides several techniques to make this work easier.

If data are to be entered into a series of rows or columns, the region can be selected ahead of time, which modifies the behaviour of the data entry keys (the Enter, Tab and arrow keys).

Data entry into a region
  1. Select the region with the mouse. For example, the region from cell C4 to cell E8 can be selected by clicking with the left mouse button on cell C4 and dragging the mouse cursor to cell E8. (More information on complex selections is presented below.)

  2. Enter data by typing each value and pressing the Enter key. If this is done repeatedly, the fifth time the Enter key is pressed the selection will not move to cell C9 but will jump up to cell D4.

The Tab key can also be used instead of the Enter key to move sequentially through the selection.
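The order in which the Enter key visits the cells of a selected region can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not Gnumeric code; the function name enter_key_order is hypothetical and the sketch assumes single-letter column names.

```python
def enter_key_order(first_col, first_row, last_col, last_row):
    """List cell names in the order the Enter key visits a selected region.

    Illustrative only: assumes single-letter column names such as "C".
    """
    cols = range(ord(first_col), ord(last_col) + 1)
    rows = range(first_row, last_row + 1)
    # Enter moves down one column, then jumps to the top of the next column.
    return [f"{chr(c)}{r}" for c in cols for r in rows]


order = enter_key_order("C", 4, "E", 8)
print(order[:6])  # ['C4', 'C5', 'C6', 'C7', 'C8', 'D4']
```

The sixth entry is D4 rather than C9, matching the jump described in step 2.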

5.4.2. Entering a Regular Sequence

It is often necessary to enter a regular sequence of numbers or a repeated sequence of text. Gnumeric provides several ways to input series and sequences of this kind.

The simplest way to fill a series with the same element repeated involves entering the element once and dragging the selection box to fill that element repeatedly. For example, the text "employee:" could be input into cell C2. That cell could then be selected. The selection box is a thick white rectangle which surrounds the cell. This selection box has a small white square at the bottom right hand corner. If the mouse cursor is placed over this square, it changes to a thin cross. If the left mouse button is pressed and held, and the mouse dragged to cell C10, Gnumeric will automatically fill all of the cells from C2 to C10 with the identical string.

An alternative way to enter data into a region involves first selecting the region, then typing the value and finally pressing the Ctrl+Enter key combination. This will fill the whole region with the value which was entered.

A similar method is available to fill sequences of integers. If the example just given were altered so that cell C2 held the number 14 and the Ctrl key were held down while dragging the selection, Gnumeric would automatically fill the cells C2 to C10 with the series 14,15,16,...,22.

More complex series and sequences of data can be entered with a similar mechanism.

To do an autofill:

  1. Enter a value into the first cell you wish to autofill. For example, the cell C2 could have the number "24" entered.

  2. Enter a second value into the second cell you wish to autofill. This must be adjacent to the first cell. This sets the increment to use when autofilling the rest of your cells. For example, the cell D2 could have the number "28" entered.

  3. Select both the cells just entered. At the bottom-right of the selection should be a small box. Your mouse cursor will change to a cross-hair when placed over the box. Press and hold on the box. Drag in the direction, either vertical or horizontal, you wish to increment and release when all the cells are filled. For example, selecting cells C2 and D2, then dragging the bottom right of the selection to cell I2 will fill the cells with the sequence from 24 to 48 with each increment being 4.

An alternative to the last step involves using the menus. Once the first two values have been input, the whole range to be filled can be selected using the mouse and then the Autofill item can be chosen from the Fill submenu of the Edit menu. This will automatically complete the series in the selected region.
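The arithmetic behind the example above is simple: the difference between the two seed values becomes the step for every following cell. A minimal Python sketch, with the hypothetical helper name autofill_linear, reproduces the numbers for cells C2 to I2; it illustrates the rule and is not how Gnumeric itself is implemented.

```python
def autofill_linear(first, second, count):
    """Extend two seed values to `count` cells, using their difference as the step."""
    step = second - first
    return [first + i * step for i in range(count)]


# Cells C2 and D2 hold 24 and 28; the range C2 to I2 covers seven cells.
print(autofill_linear(24, 28, 7))  # [24, 28, 32, 36, 40, 44, 48]
```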

Gnumeric is able to increment several types of data beyond simple integers. The procedure is the same as described above but involves different starting values. Gnumeric can increment:

Integers

1, 2, 3, etc.

Decimal Numbers

1.03, 2.05, 3.07, etc.

Weekday Names

Monday, Tuesday, etc.

Weekday Abbreviations

Mon, Tue, etc.

Month Names

January, February, etc.

Month Abbreviations

Jan, Feb, etc.

Strings with Numbers

Item1, Item2, etc.

Dates

11/14/2001, 11/15/2001, etc.

Gnumeric supports incrementing the date by day, month, or year.

Note that Gnumeric normally increments by days, but if the two starting values are 11/14/2001 and 12/14/2001, it recognizes that they fall on the same day of the month and increments the month instead. The next value is therefore 1/14/2002 rather than a date 30 days later.
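The month-stepping rule can be mimicked with Python's standard datetime module. The sketch below is an assumption about how one might reproduce the behaviour described here, not Gnumeric's own algorithm; the function name autofill_dates is hypothetical, and days of the month that do not exist in every month (29 to 31) are not handled.

```python
from datetime import date


def autofill_dates(first, second, count):
    """Extend two seed dates to `count` values.

    Steps by whole months when both seeds share a day of the month,
    otherwise by the plain difference in days.
    """
    if first != second and first.day == second.day:
        months = (second.year - first.year) * 12 + (second.month - first.month)
        result = []
        for i in range(count):
            # Count months from year 0 so that year rollover is automatic.
            total = first.year * 12 + (first.month - 1) + i * months
            result.append(date(total // 12, total % 12 + 1, first.day))
        return result
    step = second - first
    return [first + i * step for i in range(count)]


monthly = autofill_dates(date(2001, 11, 14), date(2001, 12, 14), 3)
print([d.isoformat() for d in monthly])  # ['2001-11-14', '2001-12-14', '2002-01-14']
```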

Gnumeric can be explicitly told which cells to autofill, as in the examples above, but it can also guess the number of cells to fill based on the length of an adjacent column or row. For example, if the cells B2 to B10 have information and cell C2 has the integer value "1", then selecting cell C2 and double clicking on the small box at the bottom right of the selection rectangle will fill the value "1" into cells C3 to C10.

5.4.3. Automatically Correcting Simple Mistakes

The entry of large amounts of data into a spreadsheet is tedious work which is prone to repeated mistakes. Gnumeric provides a tool to automatically correct commonly made simple mistakes. The corrections are configured and activated using the `AutoCorrect' dialog, available via Auto Correct in the Tools menu.

Figure 5-4. The Auto Correct dialog.

5.4.3.1. Capitalize the Names of Days

If this correction rule is activated, the first letter of a name of a day is capitalized automatically. For example, if you type `monday', it is automatically replaced by `Monday'.

5.4.3.2. Correct TWo INitial CApitals

A common mistake is to hold down the shift key a little too long while typing an initial letter, which produces two initial capitals instead of one. If this correction rule is activated, the second letter of words beginning with two capital letters is automatically lowercased. For example, if you type `TOtal' into a cell it is replaced by `Total'. Note that if the word consists of only two capital letters, it is not replaced.

It is possible to specify exceptions to this tool. For example, you may not want the tool to replace the word `PVbonds' when it is typed. To specify an exception, type `PVbonds' into the ``Do not correct'' entry, and press the ``Add'' button. The word should now be included in the list of exceptions. To remove a word from the list, select the word and press the ``Remove'' button.
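The rule and its exception list can be illustrated with a short Python sketch. This is not Gnumeric's implementation: the function name fix_two_initial_capitals is hypothetical, and the sketch assumes that only words with two capitals followed by a lowercase letter are corrected.

```python
import re


def fix_two_initial_capitals(word, exceptions=()):
    """Lowercase the second letter of a word that starts with two capitals."""
    if word in exceptions:
        return word  # listed under "Do not correct"
    # Two capitals followed by a lowercase letter, e.g. "TOtal".
    if re.match(r"[A-Z]{2}[a-z]", word):
        return word[0] + word[1].lower() + word[2:]
    return word  # includes words made of two capitals only, e.g. "OK"


print(fix_two_initial_capitals("TOtal"))                 # Total
print(fix_two_initial_capitals("PVbonds", ["PVbonds"]))  # PVbonds
print(fix_two_initial_capitals("OK"))                    # OK
```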

5.4.3.3. Capitalize the First Letter of Sentences

If this correction rule is activated, the first letter of a sentence typed into a cell is capitalized if it was entered in lowercase. Only text that ends with a period is considered a sentence.

It is possible to specify exceptions to this tool. For example, you may not want the tool to capitalize letters after the abbreviation `i.g.'. To specify an exception, type `i.g.' into the ``Do not capitalize after'' entry, and press the ``Add'' button. The abbreviation should now be included in the list of exceptions. To remove an entry from the list, select it and press the ``Remove'' button.

5.4.4. Using a Format Template

This section has not yet been written.

5.4.5. Generating Random Number Sequences

Figure 5-5. Random Number Generation Tool Dialog

Use the random number generation tool to generate random numbers. This tool can generate random numbers from various probability distributions.

Specify the random distribution by selecting one of the items from the random distribution list. The following random distributions are supported: Discrete, Normal, Poisson, Exponential, Binomial, Negative Binomial, Bernoulli, and Uniform.

Specify the parameters of the selected distribution:

Discrete Random Distribution

Specify the value and probability input range in the Value and Probability Input Range: entry box. The value and probability input range is a table consisting of two columns and any number of rows. The first column specifies the discrete random values and the second column the probabilities for them. The discrete random values do not have to be numbers, for example, strings will do as well. The sum of the probabilities in the second column should be one. For example, if you have the values A, B, C, and D in A1:A4 and values 0.1, 0.4, 0.2, and 0.3 in B1:B4, you would specify the value and probability input range to be A1:B4.

If the probabilities do not add to 1, they will be automatically scaled.
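The scaling step amounts to dividing each probability by the sum of the column. The following Python sketch uses the standard random module to show the idea with the A to D example above; the helper names normalize and sample_discrete are hypothetical and the code is an illustration, not Gnumeric's own generator.

```python
import random


def normalize(weights):
    """Scale the weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def sample_discrete(values, weights, size, seed=None):
    """Draw `size` values, each chosen with its (scaled) probability."""
    rng = random.Random(seed)
    return rng.choices(values, weights=normalize(weights), k=size)


values = ["A", "B", "C", "D"]
# Weights that sum to 10 are scaled to 0.1, 0.4, 0.2 and 0.3.
print(normalize([1, 4, 2, 3]))
draws = sample_discrete(values, [1, 4, 2, 3], size=10, seed=1)
print(draws)
```

As in the tool, the values need not be numbers; strings work equally well.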

Normal Random Distribution

Specify the mean and the standard deviation. The default values are 0 for the mean and 1 for the standard deviation.

Poisson Random Distribution

Specify the lambda in the Lambda entry. Lambda is the average number of events in a unit time interval.

Exponential Random Distribution

Specify b in the b Value entry.

Binomial Random Distribution

Specify the probability of success (p) in the p Value entry and the number of trials (n) in the Number of Trials entry. The Binomial distribution is a discrete distribution in which the experiment consists of n identical trials. Each trial is independent of the other trials and has two possible outcomes, a success or a failure. The probability of success p is constant from one trial to another. The mean of a random variable that has a Binomial distribution is E(X) = np, and the variance is var(X) = np(1-p).
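The formulas for the mean and variance can be checked empirically. The Python sketch below builds each Binomial value by counting successes in n independent trials, exactly as in the definition above, and compares the sample statistics with np and np(1-p). The values n = 20 and p = 0.3 and the helper name binomial_draw are arbitrary choices for illustration, and this is not the method Gnumeric uses internally.

```python
import random


def binomial_draw(rng, n, p):
    """Count the successes in n independent trials with success probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


rng = random.Random(42)  # fixed seed so the run is repeatable
n, p = 20, 0.3
sample = [binomial_draw(rng, n, p) for _ in range(10_000)]

mean = sum(sample) / len(sample)
variance = sum((x - mean) ** 2 for x in sample) / len(sample)

print(f"sample mean     {mean:.2f}  (np      = {n * p:.2f})")
print(f"sample variance {variance:.2f}  (np(1-p) = {n * p * (1 - p):.2f})")
```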

Negative Binomial Distribution

Specify the probability of success p in the p Value entry and the number of failures r in the Number of Failures entry. The Negative Binomial distribution is a discrete distribution in which the experiment consists of a sequence of independent trials. Each trial has two possible outcomes, a success or a failure. The probability of success p is constant from one trial to another. The experiment continues until r failures are observed, where r is fixed in advance. The mean of a random variable that has a Negative Binomial distribution is E(X) = r(1-p)/p, and the variance is var(X) = r(1-p)/p^2.

Bernoulli Random Distribution

Specify the probability of success (p) in the p Value entry. p is a probability value between 0 and 1. The Bernoulli distribution has two possible values, 0 and 1, and p is the probability of observing the value 1. The mean of a random variable that has a Bernoulli distribution is E(X) = 1(p) + 0(1-p) = p, and the variance is var(X) = p(1-p).

Uniform Random Distribution

Specify the range of the continuous random variable with the Between: and And: entries. The default values for these entries are 0 and 1.

Specify the number of variables in the Number of Variables: entry on the `Options' Page. This determines the number of columns of random values to be produced.

Specify the number of random numbers for each variable in the Size of Sample: entry on the same page. This determines the number of rows of random values to be produced.
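The relationship between the two options and the shape of the output can be sketched in Python: the number of variables gives the column count and the sample size gives the row count. The example below fills the table from a Normal distribution with the default mean 0 and standard deviation 1; the function name random_table and its parameters are hypothetical and only mirror the dialog entries.

```python
import random


def random_table(number_of_variables, size_of_sample,
                 mean=0.0, std_dev=1.0, seed=None):
    """Return size_of_sample rows, each with number_of_variables Normal values."""
    rng = random.Random(seed)
    return [
        [rng.gauss(mean, std_dev) for _ in range(number_of_variables)]
        for _ in range(size_of_sample)
    ]


table = random_table(number_of_variables=3, size_of_sample=5, seed=0)
print(len(table), "rows x", len(table[0]), "columns")  # 5 rows x 3 columns
for row in table:
    print("  ".join(f"{value:8.4f}" for value in row))
```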

Figure 5-6. Some Example Data for the Random Number Generation Tool
Example 5-15. Using the Random Number Generation Tool

Figure 5-6 shows some example data and Figure 5-7 the corresponding output.

Figure 5-7. Random Number Generation Tool Output