📕
SEIRFX
  • Introduction
  • About These Notes
  • Schedule
  • Unit 2
    • Node
      • Internet Fundamentals
      • Full-Stack Fundamentals
      • Intro to Node
      • Node Modules
      • Node Packages
    • Express
      • Intro to Express
      • Routes
      • Routes Lab
      • Views
      • Templates
      • Layouts & Controllers
    • CRUD & REST
      • GET & POST
      • GET & POST Lab
      • PUT & DELETE
    • API Calls in Express
      • Axios
      • Request (no longer maintained)
    • Sequelize
      • Terminology
      • Setup
      • Using Models
      • Seeding Data
      • Validations and Migrations
      • Resources
      • 1:M Relationships
      • N:M Relationships
    • Express Authentication
      • Research Components
      • Code Components
      • Auth in Theory
        • Sessions
        • Passwords
        • Middleware
        • Hooks
      • Auth in Practice
        • Create the User
        • User Signup
        • Sessions
        • User Login
        • Authorization and Flash messages
  • Development Workflow
    • Command Line
      • The Terminal
      • Filesystem Navigation
      • File Manipulation
      • Additional Topics
    • Intro to Git
      • Version Control
      • Local Git
      • Remote Git
      • Git Recipes
    • Group Collaboration
      • Git Workflows
      • Project Roles and Tools
    • VS Code Tips & Tricks
  • HTML/CSS
    • HTML
    • CSS Selectors
    • CSS Box Model and Positioning
      • Box Model
      • Display and Positioning
      • Flexbox
      • Grid
      • Flexbox & Grid Games
      • Floats and Clears
      • Additional Topics
    • Advanced CSS
      • Responsive Design
      • Pseudo-Classes/Elements
      • Vendor Prefixes
      • Custom Properties
      • Additional Topics
    • Bootstrap
    • CSS Frameworks
    • Accessibility
  • JavaScript
    • Primitives
    • Arrays
    • Objects
      • Objects Lesson
      • Objects quick guide
      • Object-ception
    • Control Flow
      • Boolean Expressions
      • Conditionals
      • Loops
      • Promises
    • Functions
      • Scope
      • Callbacks
      • Higher Order Functions
      • Callbacks Review Lab
      • Timing Functions
      • Iterators
      • Combining Data Types
      • Combining Data Types Lab
    • Javascript in the browser
      • DOM and Events
      • DOM Manipulation
      • DOM Review
      • DOM Review Lab
      • HP DOM Lab
      • Programmatic DOM Manipulation
      • Grids & Pyramids
      • DOM & Data
      • DOM Events
      • Color Palette Picker
      • Sketchpad
    • HTML5 Canvas
    • How To Reduce Redundancy
    • OOP
      • Westworld Lab
      • OOP Factories
      • OOP Inheritance
      • OOP Inheritance Lab
      • Tomagotchi Lab
      • OOP Space Battle
      • OOP Snowman
      • (2019) JavaScript OOP
      • (2016) OOP with Classes
      • (1995) OOP with Prototypes
      • Constructors
      • Prototypes
    • Intro to TDD
    • Scoping
    • Inheritance
      • Prototypal Inheritance
      • Call, Apply, and other Functions
      • ES6 Inheritance
      • Resources
    • Custom Node Modules
    • Additional Topics
      • AJAX, Fetch, and Async/Await
      • AJAX w/JSON and Localstorage
        • AJAX w/JSON
        • Local Storage
      • Async module
      • Data Scraping
  • jQuery
    • Intro
      • DOM Manipulation
      • Reddit Practice
      • Styling
      • Events
    • Plugins
    • AJAX
  • APIs
    • Fetch
    • AJAX w/jQuery
    • AJAX w/Fetch
  • Databases
    • Intro to SQL
    • Advanced SQL
    • MongoDB
      • Intro to NoSQL
      • CRUD in MongoDB
      • Data Modeling
      • Intermediate Mongo
  • Left over Node/Express
    • Testing with Mocha and Chai
    • Mongoose
      • Mongoose Associations
    • JSON Web Tokens
      • Codealong
    • Additional Topics
      • oAuth
      • Geocoding with Mapbox
      • Geocoding and Google Maps
      • Cloudinary
      • Websockets with Socket.io
      • SASS
  • Ruby
    • Intro to Ruby
    • Ruby Exercises
    • Ruby Classes
    • Ruby Testing with Rspec
    • Ruby Inheritance
    • Ruby Data Scraping
  • Ruby on Rails
    • Intro to Rails
    • APIs with Rails
    • Asset Pipeline
    • Rails Auth and 1-M
      • Auth Components
    • Rails N:M
    • ActiveRecord Polymorphism
    • Additional Topics
      • oAuth
      • SASS
      • Rails Mailers
      • Cloudinary
      • Jekyll
  • React (Updated 2019)
    • ES6+/ESNext
      • Const and Let
      • Arrow Functions
      • Object Literals and String Interpolation
      • ES6 Recap
      • ES6 Activity
    • Intro to React
      • Create React App
      • Components and JSX
      • Virtual DOM
      • Props
      • Dino Blog Activity
      • Nested Components
      • Lab: LotR
    • React State
      • Code-Along: Edit Dino Blog
      • Lab: Simple Calc
      • Lifting State
    • React Router
      • Browser History/SPAs
      • React Router (lesson and full codealong)
      • Router Lab
    • Fetch and APIs
      • APIs with Fetch and Axios
      • Fetch the Weather
    • React Hooks
    • React LifeCycle
      • Lab: Component LifeCycle
    • React Deployment
    • Additional Topics
      • React Frameworks
        • Material UI Theming
      • Typescript
        • More Types and Syntax
        • Tsconfig and Declaration Files
        • Generics with Linked List
      • Redux
      • TypeScript
      • Context API
      • React Native
  • Meteor
  • Deployment and Config
    • Installfest
      • Mac OSX
      • Linux
      • Git Configuration
      • Sublime Packages
    • Deploy - Github Pages
    • Deploy - Node/Sequelize
    • Deploy - Node/MongoDB
    • Deploy React
    • Deploy - Rails
      • Foreman (Environment Variables)
    • Deploy - AWS Elastic Beanstalk
    • Deploy - S3 Static Sites
    • Deploy - Django
    • Deploy - Flask
  • Data Structures and Algorithms
    • Recursion
    • Problem Solving - Array Flatten
    • Binary Search
    • Algorithm Complexity
    • Stacks and Queues
    • Bracket Matching
    • Ruby Linked Lists
      • Sample Code
      • Beginner Exercises
      • Advanced Exercises
    • JS Linked Lists
      • Sample Code
      • Beginner Exercises
      • Beginner Solutions
    • Hash Tables
    • Intro to Sorting
    • Insertion Sort
    • Bucket Sort
    • Bubble Sort
    • Merge Sort
    • Quick Sort
    • Heap Sort
    • Sorting Wrapup
    • Hashmaps
    • Trees and Other Topics
  • Python
    • Python Installation
    • Intro to Python
    • Python Lists
    • Python Loops
    • Python Dictionaries
    • Python Sets and Tuples
    • Python Cheatsheet
    • Python Functions
    • Python Classes
    • Python Class Inheritance
    • Intro to Flask
    • Intro to SQLAlchemy
      • Flask and SQLAlchemy
    • Using PyMongo
    • Intro to Django
    • CatCollector CodeAlong
      • URLs, Views, Templates
      • Models, Migrations
      • Model Form CRUD
      • One-to-Many Relations
      • Many-to-Many Relations
      • Django Auth
    • Django Cheatsheet
    • Django Auth
    • Django Polls App Tutorial
    • Django School Tool Tutorial
    • Django 1:M Relationships
    • Custom Admin Views
    • Data Structures and Algorithms
      • Recursion
      • Binary Search
      • Stacks and Queues
      • Linked Lists
      • Binary Trees
      • Bubble Sort
      • TensorFlow & Neural Networks
    • Adjacent Topics
      • Raspberry Pi
      • Scripting
  • Assorted Topics
    • History of Computer Science
    • Regular Expressions
    • Being Successful in SEI
    • Internet Fundamentals
      • Internet Lab
    • Adjacent Workflow
      • UX/UI
      • Wireframing Exercise: Build an Idea
      • Agile
    • Post SEI
      • Learning Resources
      • Deliverables -> Portfolio
      • FAQ
  • Projects
    • Project 1
    • Project 2
    • Project 3
      • Project 3 Pitch Guidelines
    • Project 4
    • Past Projects
      • Project 1
      • Project 2
      • Project 3
      • Project 4
      • Portfolios
    • Post Project 2
    • MEAN Hackathon
      • Part 1: APIs
      • Part 2: Angular
    • Portfolio
  • Web Development Trends
  • Resources
    • APIs and Data
    • Tech Websites
    • PostgreSQL Cheat Sheet
    • Sequelize Cheat Sheet
    • Database Administration
  • Archived Section
    • (Archived) ReactJS
      • Intro to React
        • Todo List Codealong
        • Additional Topics
      • Deploy React
      • React with Gulp and Browserify
        • Setting up Gulp
        • Additional Gulp Tasks
      • React Router
        • OMDB Router
        • OMDB Search
        • Additional Resources
      • React Animations
        • CSS Animations
    • AngularJS
      • Intro to AngularJS
        • Components and SPA
        • Create an Angular App
      • Angular Directives and Filters
      • Angular Animation
      • Angular Bootstrap Directives
        • Bootstrap Modals
      • Angular $http
      • Angular Services
        • Service Recipes
        • ngResource
        • Star Wars Codealong
      • Angular Routing
      • Angular + Express
      • Angular Authentication
        • Additional Topics
      • Angular Components
      • Angular Custom Filters
      • Angular Custom Directives
Powered by GitBook
On this page
  • Objectives
  • Hashmaps - Flexible data structures
  • Data structure basics
  • Bookshelf example
  • What this solves for
  • The hashing function
  • The hash buckets and hash collisions
  • Uses
  • Useful links

Was this helpful?

  1. Data Structures and Algorithms

Hashmaps

Objectives

  • Describe several uses for a hashmap

  • Implement a hashmap is the language of your choice

  • Describe the advantages of using a hashmap vs. other data structures

Hashmaps - Flexible data structures

In JavaScript, everything is a hashmap. MongoDB is essentially a distributed hashmap. Hashmaps are used constantly in computer science, and provide a flexible and fast way of storing and retrieving data. Why is this? To find out, we have to look at how a hashmap is implemented.

Data structure basics

Data structures are just collections of information, and there are four operations that you'll need to do with any data structure:

  • Insertion - The data structure would be useless if you couldn't add things to it.

  • Search - Finding an item when you don't initially know where it is

  • Access - Accessing an item when you know it's location

  • Deletion - You should be able to remove things from a data structure when you need to

In the most basic possible data structure, an unorganized array, insertion is easy as long as you store the length of the array. Just add something to the end. Lookup is hard. In the worst case, you go through the entire array, and only find what you're looking for at the end. Deletion is similarly worse. You might have to go through the entire array, and then if you remove an item, you either leave gaps in your array, which means degradation in performance over time, or you double your workload by resizing the array after you've made the deletion.

Things are slightly better in an ordered array, but the sort step, and the search step still aren't guaranteed to be fast, depending on the size. Hashmaps solve for many of these problems.

Bookshelf example

Think of the process of putting books into a neatly organized bookshelf. The first thing you do is look at the first letter of the author's last name. Then you go directly to the section of the bookshelf that contains authors whose last name begins with that letter. You then figure out where the book belongs within that section, and insert the book.

We can see that this example has a few main components:

  • Finding the letter of the author's last name - In the context of a HashMap, this is known as our hashing function

  • Having sections for each letter of the alphabet, in which you'll put authors - These are our hash buckets. Finding the right hash bucket can be done in constant time

  • Figuring out where the book belongs in that section - This is known as a hash collision. (presuming we have any other authors whose last name begins with that letter!)

What this solves for

Insertion, deletion, and lookup can be very fast. Almost constant, depending on how you handle hash collisions. Moreover, they remain so, even when the hashmap scales.

The hashing function

If we ignore case, we can assume that our bookshelf has 26 hash buckets. Our hashing function is just to take the last letter of the author's last name, figure out which letter of the alphabet it is, and go directly to that bucket.

If we had to use fewer hash buckets, we could modulo the number of the last letter by the number of buckets we have, and use that.

If our hashing function returned an arbitrarily high number, we'd still just mod it by the number of buckets.

The hash function can make the entire hashmap perform better or worse. Luckily, most languages have hash function implementations that are pretty good, so you don't have to worry about it.

Abstractly here, what we're doing, is converting a key into a number. We have algorithms that can do this with arbitrary strings. Meaning, we can take a random string and turn it into a unique number, which we can then use to find an appropriate hash bucket. With random objects, we could turn them into numbers in many ways, from finding the memory addres they're stored in, to figuring out how many bytes of storage they take up, to any combination of things. You're limited only by your creativity.

The hash buckets and hash collisions

We could use as many different data structures for our hash buckets as we'd like. We could even use another hashmap for it, as long as we used a different hashing function, and we eventually went to a different data structure. Linked lists have constant insertion times, and linear lookup and deletion times. Ordered arrays mean you can use binary searches, which have O(log(n)) time for everything.

The number of hash buckets we use also have an effect on the performance of our hashmap. Too few buckets, and we have a lot of hash collisions, which degrades performance. Too many, and we use more memory than we need to.

Uses

MongoDB is essentially a distributed hashmap. Notice how every item you insert gets a big ugly string as an ID? That's basically a big number that's generated by it's internal hashing function. The advantage is that we can split our hash buckets over a number of servers to distribute load more evenly, and make our system more fault-tolerant.

Everything is a hashmap in JavaScript. This makes it easy to add arbitrary values to objects without worrying about organization.

Useful links

PreviousSorting WrapupNextTrees and Other Topics

Last updated 4 years ago

Was this helpful?

Basic Data Structures: Hash Tables
Wikipedia's entry on Hash Tables
Stack Overflow question on Hash Tables
Generate a hash from a string in JavaScript