Data are pieces of information that represent the qualitative or quantitative attributes of a variable or set of variables. Data (plural of "datum", which is seldom used) are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which information and knowledge are derived.
Data manipulation is the presentation of scientific data in a misleading way to support a hypothesis that is actually without merit. Informally called "fudging the data," this practice includes selective reporting (see also publication bias) and even simply making up false data.
Data hierarchy refers to the systematic organization of data, often in a hierarchical form. Data organization involves fields, records, files, and so on.
A data field holds a single fact. Consider a date field, e.g. "September 19, 2004". This can be treated as a single date field (e.g. birthdate) or as three fields, namely month, day of month, and year.
A record is a collection of related fields. An Employee record may contain one or more name fields, address fields, a birthdate field, and so on.
A file is a collection of related records. If there are 100 employees, then each employee would have a record (e.g. called Employee Personal Details record) and the collection of 100 such records would constitute a file (in this case, called Employee Personal Details file).
Files are integrated into a database. This is done using a Database Management System. If there are other facets of employee data that we wish to capture, then other files such as Employee Training History file and Employee Work History file could be created as well.
The above is a view of data seen by a computer user.
The above structure can be seen in the hierarchical model, which is one way to organize data in a database.
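As a rough illustration of the field, record, file, and database levels described above, the following Python sketch models a record as a group of fields and a file as a list of records. The class name, field names, and sample values are hypothetical and chosen only to mirror the employee example; a real Database Management System stores and indexes data very differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class EmployeeRecord:
    """A record: a collection of related fields (names are illustrative)."""
    name: str          # field
    address: str       # field
    birthdate: date    # one date field, readable as month, day and year


# A file: a collection of related records (sample data is made up).
employee_personal_details_file = [
    EmployeeRecord("A. Example", "1 Sample Street", date(2004, 9, 19)),
    EmployeeRecord("B. Example", "2 Sample Street", date(1999, 1, 5)),
]

# A database, in this simplified view, groups several related files.
database = {
    "Employee Personal Details": employee_personal_details_file,
    "Employee Training History": [],   # further files could be added
    "Employee Work History": [],
}

first = database["Employee Personal Details"][0]
# The single birthdate field can also be read as three separate fields.
month, day, year = first.birthdate.month, first.birthdate.day, first.birthdate.year
print(first.name, month, day, year)
```

Whether a date is stored as one field or three is a design choice: a single field is easier to compare and sort, while separate fields make it easier to query by month or year alone.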
Data collection is the process of preparing and collecting data, for example as part of a process improvement or similar project. The purpose of data collection is to obtain information to keep on record, to make decisions about important issues, and to pass information on to others. Primarily, data are collected to provide information regarding a specific topic [1].
Data collection usually takes place early in an improvement project and is often formalised through a data collection plan [2], which often contains the following activities:
- Pre-collection activity – agree on goals, target data, definitions, and methods
- Collection – carry out the data collection
- Present findings – usually involves some form of sorting [3], analysis, and/or presentation
Sorting is any process of arranging items in some sequence and/or in different sets, and accordingly, it has two common, yet distinct meanings:
- ordering: arranging items of the same kind, class, nature, etc. in some ordered sequence,
- categorizing: grouping and labeling items with similar properties together (by sorts).
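The two meanings of sorting can be contrasted in a short Python sketch. The sample items and the choice of sort key are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample items: (name, kind) pairs.
items = [("pear", "fruit"), ("carrot", "vegetable"),
         ("apple", "fruit"), ("beet", "vegetable")]

# Ordering: arrange the items in an ordered sequence (here, by name).
ordered = sorted(items, key=lambda item: item[0])

# Categorizing: group items with similar properties together (here, by kind).
categorized = defaultdict(list)
for name, kind in items:
    categorized[kind].append(name)

print(ordered)
print(dict(categorized))
```

Ordering produces one sequence containing every item, whereas categorizing produces several labeled sets whose internal order is left as it was.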
All of the above material is from Wikipedia.
Data Filtering from MS Excel 2007
Filtering for unique values and removing duplicate values are two closely related tasks because the displayed results are the same: a list of unique values. The difference, however, is important. When you filter for unique values, you temporarily hide duplicate values, but when you remove duplicate values, you permanently delete them.
A duplicate value is one where all values in the row are an exact match of all the values in another row. Duplicate values are determined by the value displayed in the cell and not necessarily the value stored in the cell. For example, if you have the same date value in different cells, one formatted as "3/8/2006" and the other as "Mar 8, 2006", the values are treated as unique even though they represent the same date.
It's a good idea to filter for or conditionally format unique values first to confirm that the results are what you want before removing duplicate values.
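The difference between hiding and deleting duplicates can be sketched outside Excel. The Python example below is only an analogy under stated assumptions: each row is a tuple of displayed cell text, the sample data is made up, and the helper `unique_rows` is a hypothetical function, not part of Excel or any Excel library.

```python
# Rows as displayed cell text (hypothetical sample data, not Excel's own API).
rows = [
    ("Ann", "3/8/2006"),
    ("Bob", "3/9/2006"),
    ("Ann", "3/8/2006"),      # exact duplicate of the first row
    ("Ann", "Mar 8, 2006"),   # same date, different display text: kept as unique
]


def unique_rows(data):
    """Return the first occurrence of each row, preserving order."""
    seen = set()
    result = []
    for row in data:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


# Filtering for unique values: build a view; the original list is untouched.
filtered_view = unique_rows(rows)
rows_before_removal = len(rows)

# Removing duplicates: overwrite the data itself; the duplicate row is gone.
rows[:] = unique_rows(rows)

print(filtered_view)
print(rows_before_removal, len(rows))
```

Because rows are compared as displayed text, the two differently formatted dates are not treated as duplicates, which mirrors the behaviour described above. Checking the filtered view first corresponds to the advice to confirm the results before deleting anything.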
Sources:
1. Wikipedia.com, August 18, 2009
2. Microsoft Excel 2007 Help