Software Design and Development

Standard file processing algorithms used in software solutions

This material describes the standard file processing logic required by the syllabus and implements it in pseudocode for use in software solutions.

Syllabus outcomes

H4.2 A student applies appropriate development methods to solve software problems.
H4.3 A student applies a modular approach to implement well-structured software solutions and evaluates their effectiveness.
H5.2 A student creates and justifies the need for various types of documentation required for a software solution.
H5.3 A student selects and applies appropriate software to facilitate the design and development of software solutions.

Students must learn to recognise the logic of specified standard approaches, apply those standard approaches as part of the solution to a complex problem and document the logic required to solve a problem using both pseudocode and flowcharts.

It is necessary to understand the way files are organised as sequential or random access files so that the most efficient method can be chosen to store and access data in standard algorithms.

Sequential files

Objective: These files can contain sorted or unsorted data but they must be accessed in the order in which they were written. They are like songs recorded on a cassette tape.

Concept: These files can contain records of fixed or variable length. When a sequential file holds variable length records, markers are written to break the data into logical units and to signal the end of the file (EOF). The EOF marker is called the sentinel value. As the program reads the file, it compares each character with the marker characters to control processing.
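
As a rough illustration of how markers delimit variable length data, the Python sketch below builds a sequential character stream from a list of records. The function name write_sequential is hypothetical, and the choice of tab, return and asterisk as markers is an assumption that follows the algorithm description in this material rather than any fixed standard.

```python
FIELD_MARKER = "\t"   # assumed: a tab separates fields
RECORD_MARKER = "\r"  # assumed: a return ends each record
SENTINEL = "*"        # assumed: an asterisk marks end of file (EOF)


def write_sequential(records):
    """Join variable length records into one marked-up character stream."""
    parts = [FIELD_MARKER.join(fields) + RECORD_MARKER for fields in records]
    return "".join(parts) + SENTINEL


stream = write_sequential([["ANN", "LEE"], ["BOB", "RAY"]])
print(repr(stream))
```

Because the markers are ordinary characters, this sketch only works if the data itself never contains a tab, a return or an asterisk.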

Algorithm: This algorithm transfers characters from a sequential file containing variable length records (e.g. a word processed document) to an array of records (e.g. a database). The sequential file contains characters with a tab character between each variable length field, a return character after each variable length record and an asterisk as a sentinel value to mark EOF. When all records have been processed, the algorithm reports the number of records read.

[Pseudocode for the TRANSFERDATA algorithm: figure not reproduced in this text]
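
The logic described above might be sketched in Python as follows. This is not the syllabus pseudocode; the function name transfer_data, the variable names and the decision to keep a final record that has no closing return marker are all assumptions made for illustration.

```python
FIELD_MARKER = "\t"   # tab between fields
RECORD_MARKER = "\r"  # return after each record
SENTINEL = "*"        # asterisk marks EOF


def transfer_data(chars):
    """Read characters one at a time until the sentinel.

    Returns the array of records and the number of records read.
    """
    records = []
    current_record = []
    current_field = ""
    for ch in chars:
        if ch == SENTINEL:
            break  # sentinel reached: stop processing
        if ch == FIELD_MARKER:
            current_record.append(current_field)
            current_field = ""
        elif ch == RECORD_MARKER:
            current_record.append(current_field)
            records.append(current_record)
            current_record = []
            current_field = ""
        else:
            current_field += ch
    # Assumption: keep a final record that has no closing return marker.
    if current_field or current_record:
        current_record.append(current_field)
        records.append(current_record)
    return records, len(records)


records, count = transfer_data("ANN\tLEE\rBOB\tRAY\r*")
print(count, "records read:", records)
```

Each character is compared with the three marker characters, which is the comparison a desk check would trace step by step.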

Random access files

Objective: These files are accessed in the order required by the program, like songs on a CD when you program the playing order.

Concept: These files contain records of fixed or variable length. To read or write a record in a random access file, it is necessary to know where that record starts. We can do this either by writing an algorithm to calculate the starting address or by preparing an index that allows the program to quickly look up the starting address of each record. On audio CDs this index is called the table of contents (TOC). Calculating the address of a record with an algorithm is called hashing.
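
The two ways of locating a record can be sketched in Python using an in-memory file. Everything named here (RECORD_LENGTH, write_fixed, read_fixed, build_index, read_indexed) is hypothetical, record numbers are assumed to start at 0, and the fixed length version assumes no record is longer than RECORD_LENGTH characters.

```python
import io

RECORD_LENGTH = 8  # assumed fixed record length in bytes


def write_fixed(records):
    """Pad each record with spaces to RECORD_LENGTH and write them back to back."""
    f = io.BytesIO()
    for text in records:
        f.write(text.encode("ascii").ljust(RECORD_LENGTH))
    return f


def read_fixed(f, record_number):
    """Calculate the start address, then jump straight to the record."""
    f.seek(record_number * RECORD_LENGTH)  # address = number x length
    return f.read(RECORD_LENGTH).decode("ascii").rstrip()


def build_index(records):
    """Write variable length records and note where each one starts."""
    f = io.BytesIO()
    index = []
    for text in records:
        index.append((f.tell(), len(text)))  # (start address, length)
        f.write(text.encode("ascii"))
    return f, index


def read_indexed(f, index, record_number):
    """Look up the start address in the index instead of calculating it."""
    start, length = index[record_number]
    f.seek(start)
    return f.read(length).decode("ascii")


songs = ["INTRO", "BALLAD", "FINALE"]
fixed_file = write_fixed(songs)
print(read_fixed(fixed_file, 2))
var_file, toc = build_index(songs)
print(read_indexed(var_file, toc, 1))
```

In both cases the program moves directly to the record it wants without reading the records before it, which is what distinguishes random access from sequential access.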

Algorithm: This algorithm calculates the address of a record from a 10-digit barcode, read from an index, and generates the address as an integer between 1 and 2025. It does this by adding the barcode digits in two consecutive groups of five, then finding the product of the two sums. Each sum is at most 45 (five 9s), so the product is at most 45 × 45 = 2025.

[Pseudocode for the hashing algorithm: figure not reproduced in this text]
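
A minimal Python sketch of the hashing step described above, assuming the barcode arrives as a string of exactly 10 digits. The function name hash_barcode and the sample barcode are made up for illustration.

```python
def hash_barcode(barcode):
    """Sum the first five and the last five digits, then multiply the sums."""
    if len(barcode) != 10 or not (barcode.isascii() and barcode.isdigit()):
        raise ValueError("barcode must be exactly 10 digits")
    first_sum = sum(int(d) for d in barcode[:5])   # digits 1 to 5
    second_sum = sum(int(d) for d in barcode[5:])  # digits 6 to 10
    return first_sum * second_sum


print(hash_barcode("9312345678"))  # (9+3+1+2+3) * (4+5+6+7+8) = 18 * 30 = 540
```

Two points are left unhandled in this sketch: a barcode with a group of five zeros gives 0, which falls outside the range 1 to 2025, and different barcodes can hash to the same address, so a complete solution would need a rule for both cases.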

Activity: File processing

  1. Using the test data below, do a desk check of the algorithm TRANSFERDATA.

    Test data for TRANSFERDATA

    index 1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
    value G    I    R    L    S    TAB  R    U    L    E    RET  O    K    *

  2. Write an algorithm to hash to an address between 1 and 2025 using the product of the sum of the odd digits and the sum of the even digits in a 10-digit ID code.



This work was prepared by Jennifer Thomson.
