Learning
  • Software Engineering Golden Treasury
  • Trail Map
  • Caching
    • Alternatives to use before using cache
    • Caching Architecture
    • Cache Invalidation and Eviction
    • Cache Patterns
    • Cache
    • Consistency
    • Distributed Caching
    • Issues with caching
    • Types of caches
  • Career
    • algo types
    • Backend Knowledge
    • Burnout
    • consultancy
    • dev-level
    • Enterprise Developer
    • how-to-get-in-tech-from-other-job
    • how-to-get-into-junior-dev-position
    • induction
    • Interview
    • junior
    • mid
    • New Job
    • paths
    • Principle/staff Engineer
    • Requirements for job
    • Senior Dev capabilities
    • learning
      • automating-beginner
      • company1
        • analyst-progression
        • core-eng-progression
        • dev-progression
        • perf-eng-progression
        • soft-deliv-progression
    • mentoring
      • mentor-resources
    • recruitment
      • questions
      • Spotting posers
  • Computer Science
    • boolean-algebra
    • Compiler
    • Finite State Machine
    • Hashing
    • Algorithms
      • Breadth Firth Search
      • complexity
      • Depth First Search
      • efficiency
      • Sliding Window
      • sorting
    • data-structures
      • AVL Trees
      • data-structures
      • Linked List
    • machines
      • Intel Machine
      • Turing Machine
      • von neumann machine
      • Zeus Machine
  • devops
    • The 5 Ideals
    • microservice
    • Artifact repository
    • Bugs and Fixes
    • Build police
    • cloud-servers
    • Deployments
    • Environments
    • GitOps
    • handling-releases
    • infrastructure-as-code
    • System Migrations
    • SDP
    • On Premises Hosting
    • Properties/configuration
    • Release process
    • Release
    • Roll Outs
    • serverless
    • Serverless
    • Cloud Services
    • Versioning
    • AWS
      • deploy-docker-esc
      • cloud-practitiioner-essentials-notes
        • Module 1 - Intro to AWS
        • Module 2 Compute in the cloud
        • Module 3 Global Infrastructure and Reliability
        • Module 4 Networking
        • Module 5 Storage and Databases
        • Security
        • 7 Monitoring and Aanlytics
        • 8 Pricing and Support
        • 9 Migration and Innovation
      • developer-associate
        • AWS Elastic Beanstalk
    • build-tools
      • Managing dependecies
      • Apache ANT
      • Gradle
        • Custom Plugins
        • local-jars
      • Project Management - maven
        • Archtypes
        • Build Lifecycles
        • Customising build lifecycle
        • Dependencies
        • Directory layout
        • jar-files
        • one-to-one
        • Modules
        • Phases
        • Maven Plugins
        • POM
        • profiles
        • setup
        • Starting a maven project
        • wrapper
    • CI/CD
      • Continuous Delivery
      • zookeeper
      • Continuous Integration (CI)
      • github-actions
      • Pipeline
      • Teamcity
    • Cloud computing
      • Overview
      • Service Models
      • Cloud Services
    • containers
      • Best Practices
      • Docker
    • Infrastructure
      • IT Infrastructure Model
      • Non functional Attributes (Quality Attributes)
        • Infrastructure Availability
        • Performance
        • Secruity
    • monitoring
      • Alerting
      • Monitoring & Metrics
      • Metrics
      • Ready pages
      • Splunk
      • Status pages
      • notes-devops-talk
      • logging
        • logging
        • issues
        • Logging
        • Logging
    • Service mesh
      • Service Discovery
      • Istio
    • Terraform
    • container-management
      • Kubernetes
        • commands-glossary
        • OLTP
        • config-maps
        • Links
        • ingress
        • SDP
        • minikube
        • filter
        • indexes
        • sidecar
        • continuous-deployment
  • General Paradigms
    • CAP theorem
    • designing data-intensive applications summary
    • a-philosophy-of-software-design-notes
    • Aspect oriented Programming (AOP)
    • Best Practice
    • Cargo Cult
    • Clean Code
    • Coding reflections
    • Cognitive Complexity
    • Complexity
    • Conventions
    • Design discussions
    • Design
    • Error Handling Checklist
    • Exceptions
    • Feature Flags/toggle
    • Functional requirements
    • Last Responsible Moment
    • Lock In
    • Named Arguments
    • Naming
    • Performance Fallacy
    • Quality
    • Redesign of a system
    • Resuse vs Decoupling
    • Rules for software designs
    • Sad Paths
    • Scaling Webservices
    • Scientific Method
    • stream-processing
    • Upstream and Downstream
    • Patterns
      • Client-SDK-Pattern
      • ORM
      • Api gateway
      • Business Rules Engine
      • cache
      • Composition Root
      • Dependency Injection Containers
      • Dependency Injections
      • Double Dispatch
      • Exception Handling
      • Gateway pattern
      • Humble Object
      • Inheritance for reuse
      • Null Object Pattern
      • Object Mother
      • Patterns
      • Collection pipeline pattern
      • Service Locator
      • Setter constructor
      • Static factory method
      • Step Builder Pattern
      • telescopic constructors
      • Toggles
      • API
        • Aims of API designs
        • Avoid Checked Exceptions
        • Avoid returning nulls
        • Be defensive with your data
        • convience-methods
        • Fluent Interfaces
        • Loan Pattern
        • prefer-enums-to-boolean-returns
        • return-meaningful-types
        • Small intefaces
        • Support Lambdas
        • Weakest type
      • Gang of Four
        • Builder
        • Factory Pattern
        • Strategy Pattern
        • Template
        • abstract Factory
        • Adapter
        • Bridge Pattern
        • Chain of responsibility
        • Command Pattern
        • Composite Design Pattern
        • Decorator Pattern
        • Facade Pattern
        • Flyweight pattern
        • Guard Clause
        • Interpreter
        • html
        • Mediator Pattern
        • Memento Pattern
        • Observer
        • Prototype
        • Proxy
        • Singleton
        • State Pattern
        • Visitor Pattern
    • Architecture
      • Entity Component System
      • Integration Operation Segregation Principle
      • Adaptable Architecture
      • Architecture
      • C4 Modelling
      • cell-based
      • Clean/Hexagonal Architecture
      • Codifying architecture
      • Correct By configuration
      • Cost Base Architecture
      • Data Oriented Design
      • deliberate
      • Domain oriented DOMA
      • Event Driven Architecture
      • Evolutionary Architecture
      • examples
      • Feature Architecture
      • Framework and Libraries
      • functional-core-imperative-shell
      • Layered Architecture
      • Micro services
      • monoliths-to-services
      • Multi tiered Architecture
      • Multi tenant application
      • Resilient Architecture
      • stage event driven architecture (SEDA)
      • links spring rest app
      • Tomato Architecture
      • Tooling
      • Types of architecture
      • checklist
        • Checklist for new project
        • Back end Architecture Checklist
        • Front end Architecture Checklist
        • Mobile Architecture Checklist
      • Cloud Patterns
        • Command and Query Responsibility Segregation (CQRS)
        • Event Sourcing & CQRS
        • Asynchronous Request and Reply
        • Circuit Breaker
        • Retry
        • Sidecar
        • Strangler pattern
      • Domain driven design
        • value & entity
      • Microservices
        • Alternatives to choosing microservices first when scaling
        • Consistency in distributed systems
        • 12 Factor applications
      • Modularity
        • Module monolith vs Microservices
        • Spring Moduilth
      • Architecture Patterns
        • Hexagonal architecture
        • Inverting dependencies
        • Layering & Dependency Inversion Principle
        • Mappings
        • Vertical Slice architecture
        • Web Client Server
        • domain
          • Business and Data Layers Separation
          • DTO
          • Domain Model Pattern
          • Domain Object
          • Transaction Script/ Use Case pattern
        • Enterprise Patterns
          • Concurrency
          • Distribution strategies
          • Domain layer patterns
          • Layering/organisation of code
          • Mapping to datasource
          • Session State
        • Usecases
          • Use case return types
      • Serverless
        • Knative
    • Design architecture aims
      • back of envelope
      • Design ideas
      • Design mistakes
      • high-volume-design
      • ISO Quality Attributes
      • Non functional requirements
      • “Designing for Performance” by Martin Thompson
      • High Performance
      • Qaulity Attributes
        • Availability
        • System Availability
        • Fault Tolerance
        • interoperability
        • Latency
        • Maintability
        • Modifiability
        • Performance
        • Readability
        • Reliability
        • Scalability vs performance
        • Scalability
        • Scaling
        • statelessness
        • Testability
        • Throughput
      • System Design
      • web-scalability-distributed-arch
        • scalable-and-distributed-web-architecture
    • README
      • Conflict-free Replicated Data Type
      • Fallacies
      • Load balancing
      • Rate Limiting
      • Transactions
    • Patterns of Enterprise Application Architecture
      • Repository Pattern
      • Rules Engines
      • scatter-gather
      • Specification Design Pattern
      • Table Driven Development
      • Workflow Design Patterns
        • Triggers
    • Principles
      • Do It Or Get Bitten In The End
      • Dont Repeat Yourself
      • Habitability
      • Keep it simple
      • Responsibility Driven Design
      • Ya Ain’t Gonna Need It
      • Conceptual Overhead
      • CUPID
      • Reuse existing interfaces
      • Facts and Fallacies
      • locality of behaviour
      • Separation of Concerns
      • Simplicity
      • SLAP principle
      • Step down rule
      • Unix Philosophy
      • Wrong abstractions
      • SOLID
        • 1. Single Responsibility Principle
        • 2. Open Close Principle
        • 3. Liskov Substitution Principle
        • 4. Interface Segregation Principle
        • 5. Dependency Inversion Principle
        • GRASP (General Responsibility Assignment Software Principles)
        • Solid for packages
          • jobs
          • CCP
          • CRP
          • REP
          • egress
          • gossip-protocol
        • STUPID
    • programming-types
      • Coding to Contract/Interface
      • Links
      • Declarative vs Imperative Programming Languages
      • defensive-programming
      • Design by contract
      • Domain Specific Languages (DSL)
      • Event Driven
      • file-transfers
      • Logical Programming
      • Mutability
      • Self Healing
      • Simplicity
      • Type Driven Design
      • Value objects
      • Aspect Oriented Programming
      • Concurrent and Parallel Programming
        • Actor Model
        • Asynchronous and Synchronous Programming
        • Batch processing
        • Concurrency Models
        • SAP
        • Multithreading
        • Non Blocking IO
        • Optimistic vs Pessimistic Concurrency
        • Thread per connection or request model
        • Actor
        • aysnchronous-tasks
          • Computational Graphs
          • Divide and conquer
          • Future
          • Thread Pool
        • barriers
          • Barriers
          • Race conditions
        • design
          • agglomeration
          • Communication
          • Mapping
          • Partitioning
        • Liveness
          • Abandoned Lock
          • Deadlocks
          • Livelock
          • Starvation
        • locks
          • Read write lock
          • Reentrant lock
          • Try Lock
        • Mutual Exclusion
          • Data Races
          • Mutual Exclusion AKA Locks
        • performance
          • Amdahl's Law
          • Latency, throughput & speed
          • Measure Speed up
        • synchronization
          • Condition variable
          • producer consumer pattern
          • Semaphore
        • Threads and processes
          • Concurrent and parallel programming
          • Daemon Thread
          • Execution Scheduling
          • sequential-parallel
          • Thread Lifecycle
          • threads-and-processes
      • Functional Programming
        • Currying
        • design-patterns-to-func
        • imperative-programming
        • First class functions
        • Functional Looping
        • Higher Order Functions
        • Immutability
        • Issues with functional Programming
        • Lambda calculus
        • Lazy & Eager
        • map
        • Monad
        • Railway Programming
        • Recursion
        • Reduce
        • referential-transparacy
        • Referential transparency
        • Supplier
      • oop-design
        • Issues with object oriented code
        • Aggregation
        • Anti Patterns
        • Association
        • class-and-objects
        • Composition
        • general-laws-of-programming
        • general-notes
        • Getters and Setters
        • Inside out programming
        • Inversion of control
        • oop-design
        • Other principles
        • Outside in programming
        • Readability
        • Why OO is bad
        • README
          • abstraction
          • encapsulation
          • inheritance
          • Polymorphism
        • clean-code
          • Code Smells
          • Comments
          • Naming
          • CLEAN design
            • code is assertive
            • Cohesion
            • Connascence
            • Coupling
            • Encapsulation
            • Loose Coupling
            • Nonredundant code
      • Reactive Programming
        • reactive-programming
    • Projects and Software types
      • Applicatoin Development
      • Buying or creating software
      • Console Applications
      • Embedded Software development
      • Enterprise
      • Framework Development
      • Games
      • Library development
      • Rewriting
      • White Label Apps
    • State Machines
      • Spring State Machine
  • Other
    • 10x devs
    • Aim of software
    • Choosing Technologies
    • Coding faster
    • Component ownership
    • developer-pain-points
    • Developer Types
    • Effective Software design
    • Full Stack Developer
    • Good coder
    • Issues with Software Engineering and Engineers
    • Learning
    • Logic
    • Role
    • Software Actions
    • Software craftmanship
    • Software Designed
    • Software Engineering
    • Software
    • article-summaries
      • General notes
      • Summary of The Grug Brained Developer A layman's guide to thinking like the self-aware smol brained
      • improve-backend-engineer
      • Optimising Api
      • Simple and Easy
    • README
  • Hardware
    • Cpu memory
    • Storage
  • Integration
    • GRPC
    • API
    • Apis and communications between apps
    • asynchronous and synchronous communications
    • Batch Processing
    • Communications between apps
    • Delivery
    • Distributed Computing
    • Entry point
    • Event Source
    • SDP
    • egress
    • Graphql
    • Idempotency
    • Libraries
    • Long Polling
    • Multiplexing & Demultiplexing
    • Publish Subscribe
    • Push
    • Request & Response
    • REST
    • Remote Method Invocation
    • Remote Procedure Calls
    • Server Sent Events
    • Short Polling
    • Sidecars
    • SOAP
    • Stateless and Stateful
    • Streams
    • Third Party Integrations
    • wdsl
    • Web Services
    • Webhooks
    • repository
    • Kafka
      • Kafka Streams
    • message-queues
      • ActiveMQ
      • Dead Letter Queue
      • JMS
      • Messaging
  • Languages
    • C
    • Choosing A Language
    • cobol
    • Composite Data Types
    • creating
    • Date time
    • Numbers
    • Pass by value vs Pass by reference
    • Primitive Data Types
    • REST anti-patterns
    • Rust
    • Scripting
    • Static typing
    • string
    • Task Oriented Language
    • assembly
    • Getting started
      • Functional Concepts
    • cpp
    • Java
      • Code style
      • Garbage Collection
      • Intellij Debugging
      • Artifacts, Jars
      • Java internals
      • Java resources
      • Java versions
      • JShell
      • Libraries
      • opinionated-guide
      • Starting java
      • Java Tools
      • Why use java
      • Advanced Java
        • Annotations
        • API
        • Database and java
        • Debugging Performance
        • Files IO
        • Finalize
        • JDBC
        • jni
        • Libraries
        • Logging
        • SAP
        • Memory Management
        • Modules
        • OTher
        • Packaging Application
        • Pattern matching
        • performance
        • Properties
        • Reference
        • reflection
        • Scaling
        • Scheduling
        • secruity
        • Serilization
        • Time in Java
        • validation
        • Vector
        • Concurrency and Multithreaading
          • Akka
          • ExecutorCompletionService
          • Asynchronous Programming
          • Concurrency and Threads
          • CountDownLatch
          • Conccurrent Data Structures
          • Executor Service
          • Futures
          • reactive
          • Semaphore
          • structured concurrency
          • Threadlocal
          • Threads
          • Virtual Threads
          • Mutual Exclusion
            • Atomic
            • Synchronized
            • Thread safe class
            • Threads
        • debug
          • heap-dumps
          • thread-dumps
        • functional
          • Collectors
          • Exception Handling
          • Flatmap
          • Functional Programming
          • Generators
          • Immutability
          • issues
          • Optional
          • Parallel Streams
          • Reduce
        • networks
          • HTTP client
          • servlet-webcontainers
          • sockets
          • ssl-tls-https
      • Basics of java
        • compilation
        • computation
        • Conditonal/Flow control
        • Excuting code
        • Instructions
        • Looping/Iterating
        • memory-types-variables
        • methods
        • Printing to screen/debugging
        • Setup the system
        • Data structures
          • Arrays
          • Arayslist/list
          • Map
      • Effective Java notes
        • Creating and Destroying Objects
        • Methods Common to All Objects
        • best-practice-api
        • Classes and Interfaces
        • Enums and Annotations
        • Generics
      • framework
        • aop
        • bad
        • Dagger
        • Databases
        • Lombok
        • Mapstruct
        • netty
        • resliance4j
        • RxJava
        • Vert.x
        • Spring
          • Spring Data Repositories
          • actuator
          • cloud-native
          • H2 Db in Spring
          • Initializrs
          • JDBC Template
          • Java Persistence API (JPA)
          • kotlin
          • Pitfalls and advice
          • PRoxies
          • Reactive
          • spring security
          • spring-aop
          • Spring Boot
          • spring-jdbc
          • Spring MVC
          • Spring Testing
          • Testing
          • Transaction
          • patterns
            • Component Scan Patterns
            • Concurrency
            • Decorator Pattern in Spring
        • Micronaut
          • DI
        • Quarkus
          • database
          • Links
      • Intermediate level java
        • String Class
        • Assertions
        • Casting
        • Clonable
        • Command line arguments
        • Common Libraries/classes
        • Comparators
        • Where to store them?
        • Shallow and Deep Copy
        • Date and Time
        • Enums
        • Equals and Hashcode
        • Equals and hashcode
        • Exceptions
        • Final
        • Finally
        • Generics
        • incrementors
        • Null
        • packages and imports
        • Random numbers
        • Regex
        • Static
        • toString()
        • OOP
          • Accessors
          • Classes
          • Object Oriented Programming
          • Constructors
          • Fields/state
          • Inheritence
          • Interfaces
          • Methods/behaviour
          • Nested Classes
          • Objects
          • Static VS Instance
          • Whether to use a dependency or static method?
        • Other Collections
          • Other Collections
          • Arraylist vs Linkedlist
          • LinkedHashMap
          • Linked List
          • Priority queue
          • Sequenced Collections
          • Set
          • Shallow vs Deep Copy
          • Time Complexity of Collections
          • What Collection To use?
    • kotlin
      • Domain Specific Language
      • learning
      • Libraries
      • Personal Roadmap
      • Links
    • Nodejs
      • Performance
  • Management & Workflow
    • Agile
    • Take Breaks
    • # Communication
    • Engineering Daybook
    • Estimates
    • Feedback Loops
    • Little's law
    • Managing Others
    • poser.
    • Presentations
    • self-improvement
    • software-teams
    • Task List
    • trade-off
    • Types of devs
    • Type of work
    • Waterfall Methodology
    • coding-process
      • Bugs
      • Code Review
      • Code Reviews
      • Documentation
      • Done
      • Handover
      • Mob Programming
      • Navigate codebase
      • Pair Programming
      • Pull Requests
      • How to do a story
      • Story to code
      • Trunk based development
      • Xtreme Programming (XP)
      • debugging
        • 9 Rules of Thumb of Dubugging
        • Debugging
        • using-debugger
      • Legacy code
        • Legacy crisis
        • Working with legacy code
    • Managing work
      • Theory of constraints
      • Distributed Teams
      • estimations
      • Improving team's output
      • Kanban
      • Kick offs
      • Retrospectives
      • Scrum
      • Sign offs
      • Stand ups
      • Time bombs
      • Project management triangle
    • Notion
    • recruitment
      • In Person Test
      • Interviews
      • Unattended test
  • Networks
    • Content Delivery Network - CDN
    • DNS
    • cache control
    • Cookies and Sessions
    • Docker Networking
    • Duplex
    • Etags
    • HTTP Cache
    • HTTP - Hyper Text Transfer Protocol
    • HTTP/2
    • Http 3
    • Internet & Web
    • iptables
    • Keep alive
    • Leader Election
    • Load balancer
    • long-polling
    • Network Access Control
    • Network Address Translation (NAT)
    • Network Layers
    • Nginx
    • OSI network model
    • Persistent Connection
    • Polling
    • Proxy
    • Quic
    • reverse-proxy
    • servers
    • Server sent events (SSE)
    • SSH
    • Streaming
    • Timeouts
    • Url Encoding
    • Web sockets
    • WebRTC (Web Real-Time Communication)
    • Wireshark
    • tcp/ip
      • Congestion
      • IP - Internet Protocol
      • TCP - Transmission Control Protocol
  • Operating Systems
    • Cloud Computing
    • Distributed File Systems
    • Distributed Shared Memory
    • Input/Output Management
    • Inter-Process Communication
    • Threads and Concurrency
    • Virtualization
    • Searching using CLI
    • Bash and scripting
    • Booting of linux
    • makefile
    • Memory Management
    • Processes and Process Management
    • Scheduling
    • Scripting
    • Links
    • Ubuntu
    • Unix File System
    • User groups
    • Linux
  • Other Topics
    • Finite state machine
    • Floating point
    • Googling
    • Setup
    • Unicode
    • Machine Learning
      • Artificial Intelligence
      • Jupyter Notebook
    • Blockchain
    • Front End
      • Single Page App
      • cqrs
      • css
      • Debounce
      • Dom, Virtual Dom
      • ADP
      • htmx
      • Island Architecture
      • Why use?
      • Java and front end tech
      • mermaidjs
      • Next JS
      • javascript
        • Debounce
        • design
        • Event loop
        • testing
        • Typescript
        • react
          • Design
          • learning
          • performance
          • React JS
          • testing
      • performance
      • Static website
    • jobs
      • Tooling
      • bash text editor - vim
      • VS code
      • scaling
        • AI Assistant
        • Debugging
        • General features and tips and tricks
        • IDE - Intellij
        • Plugins
        • Spring usage
  • persistance
    • ACID - Atomicity, Consistency, Isolation, Durability
    • BASE - Basic Availability, Soft state, Eventual Consistency
    • Buffer
    • Connection pooling
    • service
    • Database Migrations - flywaydb
    • Databases
    • Eventual Consistency
    • GraphQL
    • IDs
    • indexing
    • MongoDB
    • Normalisation
    • ORacle sql
    • Partitioning
    • patterns
    • PL SQL
    • Replication and Sharding
    • Repository pattern
    • Sharding
    • Snapshot
    • Strong Consistency
    • links
    • Files
      • Areas to think of
    • hibernate
      • ORM-hibernate
    • Indexes
      • Elastisearch
    • relationships
      • many-to-many
      • SDP
      • serverless
      • x-to-x-relationships
    • sql
      • Group by
      • indexes
      • Joins
      • Common mistakes
      • operators
      • performance
    • types
      • maven-commands-on-intellij
      • in-memory-database-h2
      • Key value database/store
      • Mongo DB
      • NoSQL Databases
      • Relational Database
      • Relational Vs Document Databases
  • Security
    • OAuth
    • API Keys
    • Certificates and JKS
    • Cluster Secruity
    • Communication Between Two Applications via TLS
    • Cookies & Sessions
    • CORS - Cross-Origin Resource Sharing
    • csrf
    • Encryption and Decryption
    • Endpoint Protection
    • JWT
    • language-specific
    • OpenID
    • OWASP
    • Secrets
    • Secruity
    • Servlet authentication and Authorization
    • vault
  • Testing, Maintainablity & Debugging
    • Service-virtualization and api mocking
    • a-test-bk
    • Build Monitor
    • Builds
    • Code coverage
    • consumer-driven contract testing
    • Fixity
    • Living Documentation
    • Mocks, Stubs & Doubles
    • patterns
    • Quality Engineering
    • Reading and working with legacy code
    • Reading
    • remote-debug-intellij
    • simulator
    • Technical Debt
    • Technical Waste
    • Test cases
    • Test Data Builders
    • Test Pyramids
    • Test Types
    • Testing Good Practice
    • Testing
    • What to prime
    • What to test
    • Debugging
      • Debugging in kubernetes or Docker
    • fixing
      • How to Deal with I/O Expense
      • How to Manage Memory
      • How to Optimize Loops
      • How to Fix Performance Problems
    • Legacy Code
      • Learning
      • Legacy code
      • techniques
    • libraries
      • assertj
      • Data Faker
      • Junit
      • mockito
      • Test Containers
      • Wiremock
      • Yatspec
    • Refactoring
      • Code Smells
      • refactoring-types
      • Refactoring
      • Technical Debt
      • pyramid-of-refactoring
        • Pyramid of Refactoring
    • Test first strategies
      • Acceptance Testing Driven Developement (ATDD)
      • Behaviour Driven Development/Design - BDD
      • Inside out
      • Outside in
      • Test driven development (TDD)
    • testing
      • Acceptance tests
      • How Much Testing is Enough?
      • Approval Testing
      • Bad Testing
      • End to end tests
      • Honeycomb
      • Testing Microservices
      • Mutation testing
      • Property based testing
      • Smoke Testing
      • social-unit-tests
      • solitary-unit-tests
      • Static Analysis Test
      • Unit testing
  • Version Control - Git
    • Branch by Abstraction
    • feature-branching
    • Git patches
    • Trunk Based Development
Powered by GitBook
On this page
  • HashMap
  • Full buckets
  • Load factor
  • Treeifying
  • Links
  • Links
  • Why hashcode is used is in hashmap
  • Implementation
  • Iteration order
  • Complexity
  • Links

Was this helpful?

  1. Languages
  2. Java
  3. Basics of java
  4. Data structures

Map

PreviousArayslist/listNextEffective Java notes

Last updated 2 years ago

Was this helpful?

  • is an interface that maintains key-to-value relationships, and retains only unique keys

  • If the same key and different value is added to a map, the old value is replaced by the new value.

  • Implementations

    • hashmap

    • treemap

    • LinkedHashMap

  • Other languages call it a dictionary

HashMap

  • implementation of Map

  • hash table, but with a few additional embellishments

  • Initially the bucket entries will be stored in a list.

    • A list of linked lists

  • When it comes to finding a value, the hash of the key is calculated and then the equals() method is used to find the key in the list.

    • Because of the mechanics of hashing the key and using equality to find it in the list, duplicate keys are not permitted in a HashMap.

    • Inserting the same key simply replaces the key currently stored in the HashMap.

  • initialCapacity and loadFactor affect performance

    • can be set via constructor args

    • The capacity of a HashMap represents the current number of buckets that have been created, which defaults to 16.

      • An accurate initialCapacity will avoid the need to automatically rehash as the table grows.

    • The loadFactor represents how full the hash table is allowed to get before the capacity is automatically increased.

      • Increasing the capacity and recalculating hashes is a procedure known as rehashing, which doubles the number of buckets available and redistributes the data stored.

      • It is also possible to tweak loadFactor, but 0.75 (the default) provides a good balance between space and time of access.

      • A loadFactor of higher than 0.75 will reduce the rehashing requirement, but access will become slower as buckets will generally become fuller

      • Setting the initialCapacity to the maximum number of elements divided by the loadFactor will prevent a rehash operation from occurring.

  • operations

    • constant-time support for get() and put() operations, but iteration can be costly.

      • high loadfactor and initialcapacity can impact iteration

Full buckets

  • Due to using hashcode for the hashtable, there are around 4billion available numbers (range of an int).

    • but the number of objects you could choose to create is much larger. Therefore some objects must share the same hash code, by the pigeonhole principle.

  • This problem of keys sharing buckets, leads to the issue of how to find the correct entry (key-value) when they have the same hashcode

    • This leads to the bucket containing a linkedlist, to being changed to a red black tree, when the number of elements associated with a bucket reaches a threshold value

    • The use of the tree as default is not worth it for small numbers in linked list

    • So the hashcode is used to check the buckets, if no buckets then can create new linked list in bucket of the entry

      • if bucket exists it will use equals, if not equals it will add it to the linked list or tree. Otherwise overwrite if equals

Load factor

  • Is a setting, that states when the map must resize the capacity (increase the number of buckets) when the number of buckets are filling up

  • default is 75%

  • if not set, then resize does not happen, then collisions will happen, and multiple keys will be set for each bucket (linked list created)

  • As hashcode function is used, this means that their is a limit to unique hashcodes, and thus collisions will happen

  • resizing, includes rehashing, which makes this operation expensive if load factor is too low

    • also may redistribute entries in a linked list, to a bucket

  • https://www.baeldung.com/java-hashmap-load-factor

Treeifying

  • From Java8 onwards

  • internal aspect of hashmap

  • when a bucket becomes highly populated. If the bucket elements are implemented as a LinkedList, traversal to find an element becomes on average more expensive as the list grows. O(n)

  • To reduce this linear affect, the bucket turns from list to tree (treeenodes), when the treeifying_threshold is reached

  • Adding to a linked list for a bucket, increase cost of insertion into map

  • This is not done at the beginning

    • tree nodes are double the size of list nodes

    • A well-distributed hashing function will rarely cause buckets to be converted to TreeNodes.

  • If this happens needs to change initialCapacity and loadFactor

  • Implements a red-black tree structure, for better performance in searching

    • Offers worst case guarentees

  • If the hash codes are the same, it uses the compareTo method of Comparable if the objects implement that interface, else the identity hash code.

  • Implications

    • Can set up the hashcode to return a constant, the Treeifying value to 1. Then you get red and black tree data structure. Where insertion, search and deletion will o(logN).

      • But this is already implemented in java as a TreeMap

Links

  • https://runzhuoli.me/2018/08/31/the-secret-improvement-of-hashmap-in-java8.html

Links

  • https://stackoverflow.com/questions/30164087/how-does-java-8s-hashmap-degenerate-to-balanced-trees-when-many-keys-have-the-s/30180593

  • https://stackoverflow.com/questions/43911369/hashmap-java-8-implementation

  • https://www.youtube.com/watch?v=NrMaQL_4Npo

Why hashcode is used is in hashmap

  • The idea of a hashtable is that you want to be able to realize a datastructure called a dictionary in an efficient way. A dictionary is a key/value store, i.e., you want to be able to store certain objects under a certain key and later on be able to retrieve them again using the same key.

  • One of the most efficient ways to access values is to store them in an array. For instance, we could realize a dictionary that uses integers for keys and Strings for values like so:

String[] dictionary = new String[DICT_SIZE];
dictionary[15] = "Hello";
dictionary[121] = "world";

System.out.println(dictionary[15]); // prints "Hello"
  • Unfortunately, this approach is not very general at all: the index of an array has to be an integer value, but ideally we'd like to be able to use arbitrary kinds of objects for our keys, not only integers.

  • Now, the way to solve this point is to have a way of mapping arbitrary objects to integer values which we could then use as keys for our array. In Java, that's what hashCode() does. So now, we could try to implement a String -> String dictionary:

String[] dictionary = new String[DICT_SIZE];
// "a" -> "Hello"
dictionary["a".hashCode()] = "Hello";

// "b" -> "world"
dictionary["b".hashCode()] = "world";

System.out.println(dictionary["b".hashCode()]); // prints world
  • But hey, what if there is some object which we'd like to use as a key, but its hashCode method returns a value that's greater than or equal to DICT_SIZE? Then we'd get an ArrayIndexOutOfBoundsException and that would be undesirable. So, let's just make it as big as we can, right?

public static final int DICT_SIZE = Integer.MAX_VALUE // Ooops!
  • But that would mean that we would have to allocate ginormeous amounts of memory for our array, even if we only intend to store a few items. So that can't be the best solution, and in fact we can do better. Let's assume we had a function h that for any given DICT_SIZE maps arbitrary integers into the range [0, DICT_SIZE[. Then we could just apply h to whatever the hashCode() method of a key object returns and be certain that we stay in the boundaries of the underlying array

public static int h(int value, int DICT_SIZE) {
    // returns an integer >= 0 and < DICT_SIZE for every value.
}
  • That function is called a hash function. Now we can adapt our dictionary implementation to avoid the ArrayIndexOutOfBoundsException:

// "a" -> "Hello"
dictionary[h("a".hashCode(), DICT_SIZE)] = "Hello"

// "b" -> "world"
dictionary[h("b".hashCode(), DICT_SIZE)] = "world"
  • But that introduces another problem: what if h maps two different key indices to the same value? For instance:

int keyA = h("a".hashCode(), DICT_SIZE);
int keyB = h("b".hashCode(), DICT_SIZE);
keyA == keyB; //return true
  • may yield the same values for keyA and keyB, and in that case we would accidentally overwrite a value in our array:

// "a" -> "Hello"
dictionary[keyA] = "Hello";

// "b" -> "world"
dictionary[keyB] = "world"; // DAMN! This overwrites "Hello"!!

System.out.println(dictionary[keyA]); // prints "world"
  • Well, you may say, then we just have to make sure that we implement h in such a way that this can never happen. Unfortunately, this isn't possible in general. Consider the following code:

for (int i = 0; i <= DICT_SIZE; i++) {
    dictionary[h(i, DICT_SIZE)] = "dummy";
}
  • This loop stores DICT_SIZE + 1 values (always the same value, actually, namely the String "dummy") in the dictionary. Mhh, but the array can only store DICT_SIZE different entries! That means, when we use h, we would overwrite (at least) one entry. Or in other words, h will map two different keys to the same value! These "collisions" can't be avoided: if n pigeons try to go into n-1 pigeon holes, at least two of them have to go into the same hole.

  • But what we can do is to extend our implementation so that the array can store multiple values under the same index. This can easily be done by using lists (a linked list). So instead of using:

String[] dictionary = new String[DICT_SIZE];
  • we write :

List<String>[] dictionary = new List<String>[DICT_SIZE];
// Side remark: note that Java doesn't allow the creation of arrays of generic types, so the above line wouldn't compile...but it's the idea
  • That will change the access to the dictionary as follows:

// "a" -> "Hello"
dictionary[h("a".hashCode(), DICT_SIZE)].add("Hello");

// "b" -> "world"
dictionary[h("b".hashCode(), DICT_SIZE)].add("world");
  • In case our hashfunction h returns different values for all our keys, this will result in lists with only one element each, and retrieving elements is really simple:

System.out.println(dictionary[h("a".hashCode(), DICT_SIZE)].get(0)); // "Hello"
  • But we already know that in general h will map different keys to the same integer sometimes. In these cases, the lists will contain more than one value. For retrieval, we have to go through the whole list to find the "correct" value, but how would we recognize it?

  • Well, instead of storing the value alone, we could always store the complete (key,value) pair in the lists. Then lookup would be performed in two steps:

    • Apply the hashfunction to retrieve the correct list from the array.

    • Iterate through all pairs stored in the retrieved list: if the pair with the desired key is found, return the value from the pair.

  • Now adding and retrieving have become so complex that it's not indecent to treat ourselves separate methods for these operations:

List<Pair<String,String>>[] dictionary = List<Pair<String,String>>[DICT_SIZE];

public void put(String key, String value) {
    int hashCode = key.hashCode();
    int arrayIndex = h(hashCode, DICT_SIZE);

    List<Pair<String,String>> listAtIndex = dictionary[arrayIndex];
    if (listAtIndex == null) {
        listAtIndex = new LinkedList<Pair<Integer,String>>();
        dictionary[arrayIndex] = listAtIndex;
    }

    for (Pair<String,String> previouslyAdded : listAtIndex) {
        if (previouslyAdded.getKey().equals(key)) {
            // the key is already used in the dictionary,
            // so let's simply overwrite the associated value
            previouslyAdded.setValue(value);
            return;
        }
    }

    listAtIndex.add(new Pair<String,String>(key, value));
}

public String get(String key) {
    int hashCode = key.hashCode();
    int arrayIndex = h(hashCode, DICT_SIZE);

    List<Pair<String,String>> listAtIndex = dictionary[arrayIndex];
    if (listAtIndex != null) {
        for (Pair<String,String> previouslyAdded : listAtIndex) {
            if (previouslyAdded.getKey().equals(key)) {
                return previouslyAdded.getValue(); // entry found!
            }
        }
    }

    // entry not found
    return null;
}
  • So, in order for this approach to work, we actually need two comparison operations: the hashCode method to find the list in the array (this works fast if hashCode() and h are both fast) and an equals method which we need when going through the list.

  • This is the general idea of hashing, and you will recognize the put and get method from java.util.Map. Of course, the above implementation is an oversimplification, but it should illustrate the gist of it all.

  • Naturally, this approach is not limited to Strings, it works for all kinds of objects, since the methods hashCode() and equals are members of the top-level class java.lang.Object and all other classes inherit from that one.

  • As you can see, it doesn't really matter if two distinct objects return the same value in their hashCode() method: the above approach will always work! But still it is desirable that they return different values to lower the chances for hash collisions produced by h. We have seen that these can't be avoided 100% in general, but the less collisions we get, the more efficient our hashtable becomes. In the worst case, all keys map to the same array index: in that case, all pairs are stored in a single list and finding a value will then become an operation with costs linear in the size of the hashtable.

  • Note

    • If the hash function returns the same number, then all the key-value pairs will be stored in one bucket/pigeonhole, and thus you will be left with linkedlist of key-value pair. Thus losing the efficiency of the hashing function, ie get the value from key.

      • Aim is to implement a good hashcode, that will evenly spread out the hashnumbers, and thus the buckets that will be filled. Generally, use of prime numbers will be invovled

    • Due to hashcode and equuals being used in hashmap, they need to be implemented (esp if equals is overridden)

    • Keys should be immutable, to maintain contract of equality. Others will break searchign for keys to get value

      • Enums are good for this

Implementation

  • https://www.youtube.com/watch?v=1Ovg3IC-p5A

Iteration order

  • https://peterchng.com/blog/2022/06/17/what-iteration-order-can-you-expect-from-a-java-hashmap/

Complexity

  • https://javabypatel.blogspot.com/2015/10/time-complexity-of-hashmap-get-and-put-operation.html

Links

  • https://beginnersbook.com/2013/12/hashmap-in-java-with-example/

  • https://www.javacodegeeks.com/2017/11/java-hashmap-detail-explanation.html

  • https://dzone.com/articles/how-to-use-java-hashmap-effectively

  • https://www.youtube.com/watch?v=NrMaQL_4Npo

  • https://youtu.be/bnWprR-QG9w

Map
HashMap
Full buckets
Load factor
Treeifying
Links
Links
Why hashcode is used is in hashmap
Implementation
Iteration order
Complexity
Links