Thursday, August 18, 2016

Data Management - Part I

Introduction

The past two decades have witnessed enormous growth in the number and importance of
database applications. Databases are used to store, manipulate, and retrieve data in nearly
every type of organization, including business, healthcare, education, government, and libraries.
Database technology is routinely used by individuals on personal computers, by work groups
accessing databases on network servers, and by all employees using enterprise-wide
distributed applications.

Basic Concepts and Definitions

We define a database as an organized collection of related data. A database may be of any
size and complexity. For example, a salesperson may maintain a small database of customer
contacts on a laptop computer that consists of a few megabytes of data. A large corporation
may build a very large database, consisting of several terabytes of data, on a large mainframe
computer that is used for decision support applications.

Definitions of Data

Historically, the term data referred to known facts that could be recorded and stored on
computer media. For example, in a salesperson's database, the data would include facts such as
customer name, address, and telephone number. This definition now needs to be expanded to
reflect a new reality. Databases today are used to store objects such as documents,
photographs, sound, and even video segments in addition to conventional textual and
numeric data. To reflect this reality we use the following broadened definition: data
consists of facts, text, graphics, images, sound, and video segments that have meaning in the
user's environment.

We have defined a database as an organized collection of related data. By organized we
mean that the data are structured so as to be easily stored, manipulated, and retrieved by
users, and that users can use the data to answer questions about the subject the database covers.

Metadata

As we have indicated, data only become useful when placed in context. The primary
mechanism for providing context for data is metadata. Metadata are data that describe the
properties or characteristics of other data. Some of these properties include data definitions,
data structures, and rules or constraints.
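As a rough illustration of metadata, the sketch below asks a database to describe one of its own tables. It assumes SQLite (through Python's built-in sqlite3 module) and a hypothetical customer table; neither comes from the lecture itself.

```python
import sqlite3

# Hypothetical customer table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " name TEXT NOT NULL,"
    " address TEXT,"
    " telephone TEXT)"
)

# PRAGMA table_info returns one row of metadata per column:
# (cid, name, type, notnull, dflt_value, pk)
metadata = conn.execute("PRAGMA table_info(customer)").fetchall()
for cid, name, col_type, notnull, default, pk in metadata:
    print(name, col_type, "required" if notnull else "optional")
```

The rows printed here are data about the customer table (its field names, types, and a rule), not customer data.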


File

The term file has two meanings. First, it may be a computer file stored on disk, which could be either an executable
application or a document (e.g. word.exe or word.doc). Second, a file may be a collection of interrelated records (a table).

Components of a Database


  • Tables (Entities, Relations)
  • Queries
  • Forms
  • Reports
  • Macros


Table: A table is a collection of records about a specific subject, e.g. a Customer table, a Supplier table,
or a Product table.

A table contains:

  • Records (Instances, Rows)
  • Fields (Columns)



A record is an instance of an entity. It is basically a collection of field values which combine
to make up a single item description within a database table.

A field (column) is a property or characteristic of an entity.
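To make the terms concrete, here is a minimal sketch that creates a table, stores one record, and reads its field values back by name. SQLite and the sample customer values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read field values by field name

# The table: one subject (customers), three fields (columns).
conn.execute("CREATE TABLE customer (name TEXT, address TEXT, telephone TEXT)")

# One record (row): a single customer described by its field values.
conn.execute(
    "INSERT INTO customer VALUES (?, ?, ?)",
    ("A. Perera", "12 Lake Road", "555-0100"),
)

record = conn.execute("SELECT * FROM customer").fetchone()
fields = record.keys()
print(fields)          # the field names of the table
print(record["name"])  # one field value from the record
```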

Query: A query is a set of conditions, which are placed upon a given data source in order to extract
specific information as desired by the user.

e.g. using Structured Query Language (SQL):

SELECT ename
FROM empfile
WHERE Sal < 5000 AND Age > 50
ORDER BY ename ASC
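The query above can be tried out with a small runnable sketch. The empfile table layout and the four sample employees are invented for illustration, and SQLite is assumed as the database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE empfile (ename TEXT, Sal REAL, Age INTEGER)")

# Sample rows (hypothetical): only two satisfy both conditions.
conn.executemany(
    "INSERT INTO empfile VALUES (?, ?, ?)",
    [
        ("Silva", 4500, 55),     # Sal < 5000 and Age > 50: selected
        ("Fernando", 6000, 58),  # salary too high: excluded
        ("Bandara", 3000, 52),   # selected
        ("Perera", 4000, 35),    # too young: excluded
    ],
)

rows = conn.execute(
    "SELECT ename FROM empfile WHERE Sal < 5000 AND Age > 50 ORDER BY ename ASC"
).fetchall()
names = [row[0] for row in rows]
print(names)  # ['Bandara', 'Silva']
```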

Form: A form is a construct which is used to design the front end (user interface) for a database.

Report: A report is a tool for generating presentable output from a database system. Report generators usually accept either tables or query results as input and then arrange these in a presentable manner. The resulting file can be printed, stored, or exported.

Macro: A macro is a pre-coded (built-in) set of functions which can be called from within an application.

It is basically an ordered set of functions which is to be performed on data items.

Example:

CODE
OPEN data1.table --------> 'selects input source'
SORT ASC --------> 'sort function of a macro'
STORE, CLOSE --------> macro1 'store result and close'
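The idea of a macro as an ordered set of functions can be sketched in a general-purpose language. This is not how any particular database product implements macros; the step functions and the name run_macro are hypothetical.

```python
def drop_duplicates(items):
    # dict.fromkeys keeps the first occurrence of each item, in order
    return list(dict.fromkeys(items))


def sort_ascending(items):
    return sorted(items)


def run_macro(steps, items):
    """Apply each step to the data in the order the macro lists them."""
    for step in steps:
        items = step(items)
    return items


# The macro itself is just the ordered list of functions to perform.
macro1 = [drop_duplicates, sort_ascending]
result = run_macro(macro1, [30, 10, 20, 10])
print(result)  # [10, 20, 30]
```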

Field types (Data types)
A field in a database table needs to be defined prior to storing any data. Each field must have a field name, a field type and a field length. Additionally, field descriptions, integrity constraints, and default values can also be set.


  • Char/Text (Stores alphanumeric characters)
  • Integer/Number (Stores whole numbers)
  • Real (Stores numbers with decimal places)
  • Binary/Boolean Yes/No (Stores 1 or 0)
  • Date/Time (Stored according to a predefined date format such as dd/mm/yyyy)
  • Auto number (Stores an automatically generated primary key)
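The field types listed above might be declared as in the following sketch. It assumes SQLite, which has no separate Boolean or Date type (they are stored here as an integer and as text) and does not enforce field lengths; the product table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY AUTOINCREMENT,  -- auto number
        name        TEXT NOT NULL,                      -- char/text
        quantity    INTEGER DEFAULT 0,                  -- integer, with a default value
        unit_price  REAL CHECK (unit_price >= 0),       -- real, with an integrity constraint
        in_stock    INTEGER CHECK (in_stock IN (0, 1)), -- yes/no stored as 1 or 0
        added_on    TEXT                                -- date stored as text
    )
    """
)

# product_id and quantity are not supplied: the database fills them in.
conn.execute(
    "INSERT INTO product (name, unit_price, in_stock, added_on) VALUES (?, ?, ?, ?)",
    ("Keyboard", 19.99, 1, "18/08/2016"),
)
row = conn.execute("SELECT product_id, quantity FROM product").fetchone()
print(row)  # (1, 0)

# A negative price violates the integrity constraint and is rejected.
try:
    conn.execute(
        "INSERT INTO product (name, unit_price, in_stock) VALUES ('Mouse', -5, 1)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```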

Methods of data capture (inputting data)

  • Manually Keying in (Key-2-Disk)
  • Scanning (OCR, OMR)
  • Importing from external sources
  • Turnaround (feedback documents)

Manually keying in (Key-2-Disk System)

The traditional and to this day most common form of data input is based on the key-2-disk
system where data items from physical sources are read by a data entry operator and then
typed into the computer using a standard keyboard.

Scanning supported by character recognition

A newer approach automates the data entry process. It involves first scanning in
images of physical documents and then using text recognition software to convert the
image into digital text / characters.

Types of scanning-recognition devices

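OMR (Optical Mark Recognition) - identifies marks and shadings. Example: You fill in a form by shading a box or circle for each answer, and the software recognizes which boxes or circles are filled.

OCR (Optical Character Recognition) - identifies textual characters. Example: You scan a document and the software automatically converts its letters to an editable format, such as text in a text editor. The same applies to a form where you write one character per box, such as your name: the software recognizes the letter inside each box.

MICR (Magnetic Ink Character Recognition) - identifies magnetic ink symbols. Example: the numbers printed along the bottom of bank cheques.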
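The mark-detection idea behind OMR can be shown with a toy sketch. It assumes the scanner has already measured how dark each answer bubble is (0.0 for white paper, 1.0 for solid black); the 0.5 threshold, the sample sheet, and the function name read_marks are all made up for illustration, and real OMR software is considerably more involved.

```python
FILL_THRESHOLD = 0.5  # assumed cut-off between "blank" and "filled"


def read_marks(sheet, options="ABCD"):
    """Return the chosen option per question, or None if unclear."""
    answers = []
    for row in sheet:
        filled = [
            options[i]
            for i, darkness in enumerate(row)
            if darkness >= FILL_THRESHOLD
        ]
        # Accept the answer only when exactly one bubble is filled.
        answers.append(filled[0] if len(filled) == 1 else None)
    return answers


sheet = [
    [0.05, 0.92, 0.10, 0.03],  # option B shaded
    [0.88, 0.07, 0.04, 0.90],  # two bubbles shaded: ambiguous
    [0.02, 0.04, 0.03, 0.06],  # left blank
]
answers = read_marks(sheet)
print(answers)  # ['B', None, None]
```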

Sources for importing data

  • Floppy disks (diskettes)
  • CD-ROMs, CD-R, CD-RW
  • Zip disks
  • Internal Databases
  • Local computer networks
  • The Internet

Turnaround documents

A turnaround document is a document which is used to provide automated entry of user responses. It involves first printing out a sheet of questions asking users to either mark or enter details in appropriate blank spaces. Once completed, the document is returned to the company, which in turn feeds it into the computer system using OCR/OMR scanning devices. The data obtained is then stored within the system.

Processing of data

Processing means taking in raw, unprocessed facts and figures, performing the required procedures on them, and then presenting the results in a manner which is meaningful to the user.

Types of processing that could be done on data

  • Querying (extracting information)
  • Updating (making changes)
  • Mathematical and Statistical functions
  • Summarizing
  • Grouping
  • Sorting
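Several of these processing types can be seen together in one short sketch. It assumes SQLite and a hypothetical sales table with invented figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("South", 250.0), ("North", 50.0)],
)

# Updating: change an existing value.
conn.execute("UPDATE sales SET amount = 300.0 WHERE region = 'South'")

# Querying: extract only the rows that match a condition.
north_count = conn.execute(
    "SELECT COUNT(*) FROM sales WHERE region = 'North'"
).fetchone()[0]

# Summarizing with a mathematical function: one total for the whole table.
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# Grouping and sorting: one total per region, largest first.
by_region = conn.execute(
    "SELECT region, SUM(amount) AS region_total "
    "FROM sales GROUP BY region ORDER BY region_total DESC"
).fetchall()

print(north_count)  # 2
print(total)        # 450.0
print(by_region)    # [('South', 300.0), ('North', 150.0)]
```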
