2016-03-22-data-structure-basics-1.md 5.1 KB


title: Data Structure Basics pt.1 description: > Introductory notes on the basics of data structures. categories: posts

tags: [data-structures, ruby]

Table of Contents
  • TOC {:toc}

What is a Data Structure?

As applications get more complex, and data grows richer, we face three common problems:

  • Data search becomes slower.
  • Processor speed becomes irrelevant.
  • Need to serve multiple requests.

Data structures help by keeping data organized efficiently.

At its core, a data structure is an efficient data arrangement. They are everywhere in our daily life. We simply think of them as collections of information. For instance, a shopping list, a directory, a dictionary, a flight schedule.

Since raw data usually fits a data structure we need to study at least the most recurrent ones. Before learning a bit about their characteristics lets see what implementation and interface are, though.

Depending on the context, implementation refers to either the code representation of a data structure. Or the definition of the algorithms used on a given data structure.

An interface represents the set of operations that a data structure supports. It may include the type of parameters these operations take, as well as their return type.

Basic Concepts

Since data structures are, by definition, data centered lets clarify some concepts which will make further explanations brief yet loaded with content.

Data Definition

We define data according to its characteristics. A good definition should be:

  • Atomic: it defines a simple concept whose decomposition serves no purpose.
  • Traceable: it should be able to be mapped to other data elements.
  • Accurate: unambiguous.
  • Clear: understandable.
  • Concise: centered on the data.

Data Object & Data Type

Data objects are mere representations of objects that have data. Data types refer to different ways to classify data objects according to its data and operations it can handle.

Even though there are many data types they are broadly divided into:

  • Primitive types, such as Integers, Booleans, Strings, and so on.
  • Composite types, such as structs.
  • Abstract data types. This includes most data structures.

From now on we'll focus mostly on abstract data types.

Characteristics

When designing data structures we should keep in mind its:

  • Correctness. It implements the right/expected interface.
  • Time Complexity. The time it takes to execute each behavior should be small.
  • Space Complexity. Memory usage per executed behavior should be small, too.

Common Interface

Most data structures implement these 5 fundamental behaviors:

  • access (one item/all items)
  • insert (at end/at position)
  • delete (from end/from position)
  • find (if exists/at what location)
  • sort (sort in place/create sorted version)

Although we shouldn't be taken by surprise if they don't.

Execution Time Cases

Since speed is part of what makes data structures efficient they don't persist. That is, they are created and hold in memory. We consider three cases when comparing data structures for speed:

  • Worst case. The maximum amount of time it takes a behavior to execute.
  • Average case. The average execution time per behavior.
  • Best case. The least possible execution time per behavior.

Now its time for some simple data structures.

Simple Data Structures

Most languages don't implement these data structures, not because they are not useful but because we only need to collect data objects to implement them; no behavior.

Records

A record results from intentionally grouping some values together. Hence, a record is a container of values, also known as, fields. Typically, a record is defined by having a fix number of fields. Those fields, always appear in the same order. Notice the lack of name.

Structs

We use structs whenever we just want to group some data objects together and give that group a name. Structs differentiate from classes:

Struct Class
only data - no behavior behavior and data
simple creation explicit instantiation (new, alloc)
value types reference types
no OO features Polymorphism, etc.
"Plain Old Data Structure"
(PODS)

New developers should distinguish between data structures' naming conventions and their language's conventions. Since Ruby is my native computer language I will point some of these differences when appropriate. For further details, check Ruby's documentation.

Structs in Ruby

Ruby has structs, though they are implemented as lightweight classes. This leaves the 'only data - no behavior' rule up to the developer.

Before We Go

We covered lots of key concepts. From why we need data structures, and are in fact, already using them in our daily life, all the way to their key characteristics.

We also cover the two must basic data structures; records and structs. Finally, we talked about how Ruby structs differentiate from their data structure counter part.

That's it for this post. Next one{:rel="noopener"} will cover the basics on arrays and sorting algorithms.