
Tech Articles – (please note: these posts are collated from AmigosCode, Alex Xu, and many others; full copyright remains with the owners of the material)

How do fintech startups find new opportunities among so many payment companies? What do PayPal, Stripe, and Square do exactly?

Steps 0-1: The cardholder opens an account in the issuing bank and gets a debit/credit card. The merchant registers with an ISO (Independent Sales Organization) or MSP (Member Service Provider) for in-store sales. The ISO/MSP partners with payment processors to open merchant accounts.

Steps 2-5: The acquiring process. The payment gateway accepts the purchase transaction and collects payment information, which is then sent to a payment processor that uses the customer's information to collect the payment. The acquiring processor sends the transaction to the card network. It also owns and operates the merchant's account during settlement, which does not happen in real time.

Steps 6-8: The issuing process. The issuing processor talks to the card network on the issuing bank’s behalf. It validates and operates the customer’s account.
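To make the hops concrete, here is a toy Python sketch of the authorization path described above. Every name in it (payment_gateway, acquiring_processor, the approval rule) is invented for illustration; real integrations go through certified processor APIs.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    card_number: str
    amount_cents: int
    merchant_id: str

def payment_gateway(txn: Transaction) -> bool:
    # Step 2: collect payment information and hand off to the acquiring side.
    return acquiring_processor(txn)

def acquiring_processor(txn: Transaction) -> bool:
    # Steps 3-5: forward the transaction to the card network.
    return card_network(txn)

def card_network(txn: Transaction) -> bool:
    # Steps 6-7: route the request to the issuing side.
    return issuing_processor(txn)

def issuing_processor(txn: Transaction) -> bool:
    # Step 8: validate the cardholder's account on the issuing bank's behalf.
    return txn.amount_cents <= 50_000  # stand-in authorization rule

print(payment_gateway(Transaction("4242...", 1999, "M-123")))
```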

#payments #ecommerce


How to scale a website to support millions of users? We will explain this step-by-step.

The diagram below illustrates the evolution of a simplified eCommerce website. It goes from a monolithic design on a single server to a service-oriented/microservice architecture.

Suppose we have two services: inventory service (handles product descriptions and inventory management) and user service (handles user information, registration, login, etc.).

Step 1 – With the growth of the user base, a single server cannot handle all the traffic anymore. We split the application server and the database server onto two separate machines.

Step 2 – The business continues to grow, and a single application server is no longer enough. So we deploy a cluster of application servers.

Step 3 – Now that incoming requests have to be routed to multiple application servers, how can we ensure each server gets an even load? A load balancer handles this nicely.
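As a minimal illustration of what the load balancer does (not any specific product's algorithm), round-robin rotation can be sketched in a few lines of Python:

```python
import itertools

# Minimal round-robin sketch: hand each incoming request to the next
# application server in the rotation.
app_servers = ["app-1:8080", "app-2:8080", "app-3:8080"]
rotation = itertools.cycle(app_servers)

def pick_server() -> str:
    return next(rotation)

for request in range(5):
    print(f"request {request} -> {pick_server()}")
```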

Step 4 – With the business continuing to grow, the database might become the bottleneck. To mitigate this, we separate reads and writes so that frequent read queries go to read replicas. With this setup, the write throughput of the primary database can be greatly increased.
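A hedged sketch of this read/write split, with connection names invented for illustration: writes route to the primary while reads go to a replica.

```python
import random

class RoutingDatabase:
    def __init__(self, primary, replicas):
        self.primary = primary      # handles all writes
        self.replicas = replicas    # handle read-only queries

    def execute(self, sql: str):
        target = self.primary if self._is_write(sql) else random.choice(self.replicas)
        print(f"routing to {target}: {sql}")

    @staticmethod
    def _is_write(sql: str) -> bool:
        return sql.lstrip().split()[0].upper() in {"INSERT", "UPDATE", "DELETE"}

db = RoutingDatabase("primary-db", ["replica-1", "replica-2"])
db.execute("SELECT * FROM products")            # served by a replica
db.execute("UPDATE users SET name = 'Ada'")     # served by the primary
```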

Step 5 – Suppose the business continues to grow. A single database cannot handle the load from both the inventory table and the user table. We have a few options:

1. Vertical scaling: adding more power (CPU, RAM, etc.) to the database server. This has a hard limit.

2. Horizontal partitioning (sharding): adding more database servers and spreading the data across them (see the sketch after this list).

3. Adding a caching layer to offload read requests.
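As a minimal sketch of option 2, rows can be routed to a shard by hashing the partition key. The shard names are invented, and production systems often prefer consistent hashing so that resharding moves less data:

```python
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(user_id: str) -> str:
    # Hash the partition key and map it onto one of the shards.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("user-42"))  # always maps the same user to the same shard
```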

Step 6 – Now we can modularize the functions into different services. The architecture becomes service-oriented / microservice.

#scaling #scale #eCommerce

REST: Exchanges resource-based requests and responses between client and server over HTTP. The most widely used API style.

GraphQL: Lets clients pull exactly the data they need in one query. Offers more control than REST.

WebSocket: Keeps a live connection open for real-time, two-way communication.

gRPC: Uses Protocol Buffers for efficient messaging over HTTP/2.

MQTT: A lightweight publish/subscribe protocol for IoT; messages travel via a broker.

Serverless: Executes API handlers on demand, no server management needed. Well suited to the cloud.
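As a rough sketch of the difference between the first two styles (the endpoint and field names are invented), a REST client fetches whatever representation the endpoint defines, while a GraphQL client names the exact fields it wants:

```python
import json
import urllib.request

BASE = "https://api.example.com"  # hypothetical service

# REST: the /users/42 endpoint decides which fields come back.
rest_req = urllib.request.Request(f"{BASE}/users/42")

# GraphQL: one POST endpoint; the query selects exactly the fields needed.
graphql_body = json.dumps({"query": "{ user(id: 42) { name email } }"}).encode()
graphql_req = urllib.request.Request(
    f"{BASE}/graphql",
    data=graphql_body,
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(rest_req) / urlopen(graphql_req) would send these.
```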

#APIs #architecture

How do live streaming platforms like YouTube Live, TikTok Live, or Twitch work? Live streaming is challenging because video content must be sent over the internet in near real time: video processing is compute-intensive, and transferring a large volume of video data takes time. The diagram below explains what happens behind the scenes to make this possible.

Step 1: The streamer starts their stream. The source could be any video and audio source wired up to an encoder.

Step 2: To provide the best upload condition for the streamer, most live streaming platforms provide point-of-presence servers worldwide. The streamer connects to a point-of-presence server closest to them.

Step 3: The incoming video stream is transcoded into different resolutions and divided into small video segments, each a few seconds long.

Step 4: The video segments are packaged into different live streaming formats that video players can understand. The most common live-streaming format is HLS, or HTTP Live Streaming.
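To give a feel for what the packaging step produces, here is a hedged sketch that emits a minimal HLS media playlist for three 4-second segments (segment names are invented; real packagers emit many more tags):

```python
# Builds a minimal HLS media playlist (.m3u8) for a list of segments.
SEGMENT_SECONDS = 4
segments = ["segment_000.ts", "segment_001.ts", "segment_002.ts"]

lines = [
    "#EXTM3U",
    "#EXT-X-VERSION:3",
    f"#EXT-X-TARGETDURATION:{SEGMENT_SECONDS}",
    "#EXT-X-MEDIA-SEQUENCE:0",
]
for name in segments:
    lines.append(f"#EXTINF:{SEGMENT_SECONDS:.1f},")
    lines.append(name)

print("\n".join(lines))  # the player re-fetches this playlist as new segments land
```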

Step 5: The resulting HLS manifest and video chunks from the packaging step are cached by the CDN.

Step 6: Finally, the video starts to arrive at the viewer’s video player.

Steps 7-8: To support replay, videos can optionally be stored in object storage such as Amazon S3.

#streaming

Microservices architecture has become popular for building complex, scalable software systems.

This architectural style structures an application as a collection of loosely coupled, independently deployable services. Each microservice is focused on a specific business capability and can be developed, deployed, and scaled independently.

While microservices offer numerous benefits, such as improved scalability, flexibility, and faster time to market, they also introduce significant challenges in terms of data management.

One of the fundamental principles of microservices architecture is that each service should own and manage its data. This principle is often expressed as “don't share databases between services” and it aims to ensure loose coupling and autonomy among services, allowing them to evolve independently.

However, it's crucial to distinguish between sharing a data source and sharing data itself. While sharing a data source (e.g., a database) between services is discouraged, sharing data between services is often necessary and acceptable.

In this post, we’ll look at different ways of sharing data between microservices and the various advantages and disadvantages of specific approaches.

## Sharing Data Source vs Sharing Data

As mentioned earlier, it's crucial to understand the difference between sharing a data source and sharing data itself when working with microservices.

This distinction has significant implications for service coupling, independence, and overall system design.

Sharing a data source means multiple services directly access the same database, potentially leading to tight coupling and dependencies. It violates the principle of service data ownership in microservices.

On the other hand, sharing data involves services exchanging data through well-defined APIs or messaging patterns, maintaining their local copies of the data they need. This helps preserve service independence and loose coupling.

Let's consider an example scenario involving an Order service and a Product service to illustrate the data-sharing dilemma.

  • Order Service:
    • Manages order information.
    • Stores order details in its database.
  • Product Service:
    • Handles product information.
    • Maintains a separate database for product details.
  • Relationship:
    • Each order is associated with a set of products.

The challenge arises when the Order service needs to display product information alongside order details.

In a monolithic architecture, this would be straightforward since all the data resides in a single database. However, in a microservices architecture, the Order service should not directly access the Product service database. Instead, it should fetch the data via the Product Service.

This scenario highlights the need for effective data-sharing strategies in microservices:

  • Data Independence: Each service owns its data, preventing direct database access between services.
  • Data Retrieval: The Order service must find a way to retrieve product information without violating service boundaries.
  • Performance Considerations: The chosen data-sharing approach should not significantly impact system performance or introduce excessive latency.
  • Consistency: Ensuring data consistency between services becomes crucial when sharing data across service boundaries.

Let us now look at different ways of sharing data.

## Synchronous Data Sharing in Microservices

Synchronous data sharing in microservices involves real-time communication between services, where the requesting service waits for a response before proceeding. This approach ensures immediate consistency but introduces challenges in terms of scalability and performance.

### Request/Response Model

The request/response model is the most straightforward synchronous approach:

  • Service A sends a request to Service B for data.
  • Service B processes the request and sends back a response.
  • Service A waits for the response before continuing its operation.

For example, whenever the Order Service is called to provide order details (which include product information), it has to fetch product data by calling the Product Service.
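A minimal sketch of that synchronous hop, assuming a hypothetical HTTP endpoint on the Product Service (the URL and field names are invented):

```python
import json
import urllib.request

PRODUCT_SERVICE = "http://product-service:8080"  # hypothetical address

def fetch_product(product_id: str) -> dict:
    # The Order Service blocks here until the Product Service responds
    # (or the timeout fires); this wait is the cost of the synchronous model.
    url = f"{PRODUCT_SERVICE}/products/{product_id}"
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.load(resp)

def get_order_details(order: dict) -> dict:
    products = [fetch_product(pid) for pid in order["product_ids"]]
    return {**order, "products": products}
```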

As expected, the synchronous data-sharing approaches face several challenges:

  • Increased Response Time: Each synchronous call adds to the overall response time of the system.
  • Cascading Failures: If one service is slow or down, it can affect the entire chain of dependent services.
  • Resource Utilization: Services may need to keep connections open while waiting for responses, potentially leading to resource exhaustion.
  • Network Congestion: As the number of inter-service calls increases, network traffic can become a bottleneck.

To avoid these challenges, services can instead share data through duplication: for example, storing the product name alongside the order details in the order table. This way, the Order Service does not have to contact the Product Service to fetch the product's name.

However, this creates consistency issues during data updates. For example, the product name is now duplicated across two different services. If the product name gets changed, the updates have to be done in both the product table and the order table to maintain consistency.

Note that this is just a simple example for explanation purposes. Even more critical consistency requirements can exist in a typical application.

### The Gateway Approach

The gateway approach is a simple way to ensure data consistency while sharing data across multiple services.

Here’s an example scenario of how it can work:

  • A coordinator that acts like a gateway initiates the update process.
  • The coordinator sends update requests to all participating services.
  • Each service performs the update and sends back a success or failure response.
  • If all services succeed, the transaction is considered complete. If any service fails, the coordinator must initiate a rollback on all services.

See the diagram below, where a change to the product name requires an update in the Order service, which holds a copy of the product name. The coordinator ensures that both updates succeed before informing the user.
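Here is a minimal sketch of that coordinator logic; the update/rollback interface is invented for illustration:

```python
class ServiceClient:
    def __init__(self, name: str):
        self.name = name

    def update(self, payload: dict) -> bool:
        print(f"{self.name}: applying {payload}")
        return True  # a real client would call the service and report success

    def rollback(self, payload: dict) -> None:
        print(f"{self.name}: rolling back {payload}")

def coordinated_update(services: list[ServiceClient], payload: dict) -> bool:
    done = []
    for svc in services:
        if svc.update(payload):
            done.append(svc)
        else:
            # Any failure: undo the services that already applied the change.
            for applied in reversed(done):
                applied.rollback(payload)
            return False
    return True

ok = coordinated_update(
    [ServiceClient("product-service"), ServiceClient("order-service")],
    {"product_id": "p1", "name": "New Product Name"},
)
```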

Some advantages of the gateway approach are as follows:

  • Simpler to implement and reason about
  • Maintains consistency

However, there are also disadvantages:

  • Less reliable in failure scenarios.
  • Poor experience for users if one of the services fails.
  • Difficult to handle partial failures.

As evident, while synchronous approaches offer strong consistency and simplicity, they come with significant trade-offs in terms of scalability, performance, and resilience.

This is also the reason why asynchronous approaches are usually more popular.

## Asynchronous Data Sharing in Microservices

Asynchronous data sharing in microservices architectures enables services to exchange data without waiting for immediate responses. This approach promotes service independence and enhances overall system scalability and resilience.

Let's explore the key components and concepts of asynchronous data sharing.

### Event-Driven Architecture

In an event-driven architecture, services communicate through events:

  • When a service performs an action or experiences a state change, it publishes an event to a message broker.
  • Other services interested in that event can subscribe to it and react accordingly.
  • This loosely coupled approach allows services to evolve independently.

The example below describes such an event-driven approach (a minimal code sketch follows the list):

  • The Order Service publishes an “OrderCreated” event when a new order is placed.
  • The Inventory Service subscribes to the “OrderCreated” event and updates stock levels.
  • The Shipping Service also subscribes to the event and initiates the shipping process.
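Here is an in-memory sketch of that publish/subscribe shape; a real system would use a broker such as RabbitMQ or Kafka instead of a Python dict:

```python
from collections import defaultdict

subscribers = defaultdict(list)  # event type -> list of handlers

def subscribe(event_type: str, handler):
    subscribers[event_type].append(handler)

def publish(event_type: str, payload: dict):
    for handler in subscribers[event_type]:
        handler(payload)

# Inventory and Shipping react to the same event independently.
subscribe("OrderCreated", lambda e: print("inventory: reserve stock for", e["order_id"]))
subscribe("OrderCreated", lambda e: print("shipping: start shipment for", e["order_id"]))

# The Order Service publishes once; it does not know who is listening.
publish("OrderCreated", {"order_id": "o-123", "product_ids": ["p1", "p2"]})
```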

### Message Queues

Message queues, such as RabbitMQ, serve as the backbone of asynchronous communication in microservices:

  • They provide a reliable and scalable way to exchange messages between services.
  • Services can publish messages to specific topics or queues.
  • Other services can consume those messages at their own pace.
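To keep the example self-contained, here is a sketch using Python's standard-library queue as a stand-in for a broker such as RabbitMQ; the point is only that the consumer drains messages at its own pace:

```python
import queue
import threading

broker = queue.Queue()  # standard-library stand-in for a RabbitMQ queue

def consume():
    while True:
        message = broker.get()   # blocks until a message arrives
        if message is None:      # sentinel used here to stop the worker
            break
        print("consumed:", message)
        broker.task_done()

worker = threading.Thread(target=consume, daemon=True)
worker.start()

# Producer side: publish a few events; the consumer processes them at its own pace.
for i in range(3):
    broker.put({"event": "OrderCreated", "order_id": i})

broker.join()     # wait until every published message has been processed
broker.put(None)  # shut the worker down
```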

### Eventual Consistency

Asynchronous data sharing often leads to eventual consistency.

Services maintain their local copies of data and update them based on received events. It means that data across services may be temporarily inconsistent but will eventually reach a consistent state.

This approach allows services to operate independently and improves performance by reducing synchronous communication.

Here’s an example of the impact of eventual consistency on the earlier example of data sharing between the Product and Order services:

  • The Order Service maintains a local copy of the product data it needs.
  • When a product’s details (such as its name) are updated, the Product Service publishes a “ProductUpdated” event.
  • The Order Service consumes the event and updates its local copy of the product data asynchronously.
  • There may be a short period where product data is inconsistent across services, but it will eventually become consistent.

See the diagram below:
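Concretely, the consuming side can be sketched as below; the event fields and the local store are invented for illustration:

```python
# The Order Service keeps a local, denormalized copy of product names and
# patches it whenever a ProductUpdated event arrives.
local_products = {"p1": {"name": "Old Product Name"}}

def on_product_updated(event: dict) -> None:
    # Until this handler runs, reads of local_products serve stale data;
    # that window is the "temporary inconsistency" described above.
    local_products[event["product_id"]]["name"] = event["name"]

on_product_updated({"product_id": "p1", "name": "New Product Name"})
print(local_products["p1"]["name"])  # consistent again: "New Product Name"
```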

The advantages of asynchronous data sharing are as follows:

  • Loose Coupling: Services can evolve independently without tight dependencies on each other.
  • Scalability: Services can process messages at their own pace, allowing for better scalability and resource utilization.
  • Resilience: If a service is temporarily unavailable, messages can be buffered in the queue and processed later, improving system resilience.
  • Improved Performance: Asynchronous communication reduces the need for synchronous requests, resulting in faster response times and improved overall performance.

However, there are also some disadvantages:

  • Eventual Consistency: Data may be temporarily inconsistent across services, which can be challenging to handle in certain scenarios.
  • Increased Complexity: Implementing event-driven architectures and handling message queues adds complexity to the system.
  • Message Ordering: Ensuring the correct order of message processing can be challenging, especially in scenarios where message ordering is critical.
  • Error Handling: Dealing with failures and errors in asynchronous communication requires careful design and implementation of retry mechanisms and compensating actions.

## Hybrid Approach

A hybrid approach to data sharing in microservices combines elements of both synchronous and asynchronous methods. This approach aims to balance the benefits of local data availability with the need for up-to-date and detailed information from other services.

Let's consider a ride-sharing application with two microservices: Driver and Ride.

  • The Ride service stores basic driver information (ID, name) locally.
  • For detailed information (e.g., driver rating, car details), the Ride service makes a synchronous call to the Driver service.

This hybrid approach allows the Ride service to have quick access to essential driver data while still retrieving more comprehensive information when needed.
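A hedged sketch of that hybrid lookup, assuming a hypothetical Driver service endpoint and adding a small TTL cache (one of the strategies discussed below):

```python
import json
import time
import urllib.request

DRIVER_SERVICE = "http://driver-service:8080"  # hypothetical address

# Basic driver fields live locally with the ride record...
ride = {"ride_id": "r-1", "driver_id": "d-7", "driver_name": "Ada"}

_details_cache: dict = {}  # cache for the synchronous detail lookups
TTL_SECONDS = 60

def driver_details(driver_id: str) -> dict:
    entry = _details_cache.get(driver_id)
    if entry and time.time() - entry["at"] < TTL_SECONDS:
        return entry["data"]  # fresh enough: skip the network hop
    # ...while rating and car details come from the Driver service on demand.
    url = f"{DRIVER_SERVICE}/drivers/{driver_id}"
    with urllib.request.urlopen(url, timeout=2) as resp:
        data = json.load(resp)
    _details_cache[driver_id] = {"data": data, "at": time.time()}
    return data
```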

See the diagram below:

The hybrid approach offers a couple of benefits:

  • Reduces Redundant Calls: By storing basic information locally, the Ride service can reduce the frequency of service-to-service calls. This improves performance and minimizes the load on the network.
  • Smaller Database Footprint: Only essential data is duplicated across services, resulting in a smaller database footprint. This saves storage space and reduces the overhead of managing redundant data.

However, it also has some drawbacks:

  • Still Affects Response Time: Although basic information is available locally, retrieving detailed information through synchronous calls to the Driver service still impacts the overall response time of the Ride service.
  • Some Data Duplication: Even though only essential data is duplicated, there is still some level of data duplication across services. This requires careful management to ensure data consistency and avoid discrepancies.

When implementing a hybrid approach, consider the following:

  • Data Consistency: Ensure that the duplicated data remains consistent across services. Implement mechanisms to update the local copy of data when changes occur in the source service.
  • Caching Strategies: Employ caching strategies to store frequently accessed detailed information locally, reducing the need for repetitive synchronous calls to other services.
  • Asynchronous Updates: Consider using asynchronous methods, such as event-driven architecture or message queues, to propagate updates to the duplicated data asynchronously, minimizing the impact on response times.

## Trade-offs and Decision Making

When designing a microservices architecture that involves data sharing, it's crucial to carefully evaluate the trade-offs between consistency, performance, and scalability.

This decision-making process requires a deep understanding of your system's requirements and constraints.

### 1 – Evaluating Consistency Requirements

One of the key factors in choosing an appropriate solution is the consistency requirement.

#### Strong Consistency

Strong consistency ensures that all services have the most up-to-date data at all times. It is suitable for systems where data accuracy is critical, such as financial transactions or medical records.

The main characteristics are as follows:

  • Often implemented using synchronous communication patterns like the gateway pattern or the request/response approach.
  • Ensures immediate data consistency across all services.
  • Drawbacks include higher latency and reduced availability.

#### Eventual Consistency

Eventual consistency allows for temporary data inconsistencies but guarantees that data will become consistent over time. It is appropriate for systems that can tolerate temporary inconsistencies, such as social media posts or product reviews.

The primary characteristics of a system based on eventual consistency are as follows:

  • Implemented using asynchronous communication patterns like event-driven architecture.
  • Allows for faster response times and higher availability.
  • Requires careful design to handle conflict resolution and compensating transactions.

### 2 – Performance Considerations

The key performance-related considerations are as follows:

  • Latency
    • Strong consistency often leads to higher latency due to synchronous communication.
    • Eventual consistency can provide lower latency but may serve stale data.
  • Throughput
    • Eventual consistency generally allows for higher throughput as services can operate more independently.
    • Strong consistency may limit throughput due to the need for coordination between services.
  • Resource Utilization
    • Strong consistency may require more computational and network resources to maintain data integrity.
    • Eventual consistency can be more efficient in terms of resource usage but requires additional storage for local data copies.

### 3 – Scalability

The data-sharing solution has a significant impact on the application's scalability. Some of the key aspects are as follows:

  • Horizontal Scaling: Eventual consistency models often scale better horizontally, as services can be added without significantly impacting overall system performance. Strong consistency models may face challenges in horizontal scaling due to increased coordination overhead.
  • Data Volume: As data volume grows, maintaining strong consistency across all services becomes challenging. Eventual consistency can handle large data volumes more gracefully but requires careful management of data synchronization.
  • Service Independence: Eventual consistency allows services to evolve more independently, facilitating easier scaling. Strong consistency models create tighter coupling between services.

## Summary

In this article, we’ve taken a detailed look at the important topic of sharing data between microservices. This is a critical requirement when it comes to building robust and high-performance microservices.

Let’s summarize the key learnings:

  • The principle of each service owning its data aims to ensure loose coupling and autonomy among services, allowing them to evolve independently.
  • However, it's crucial to distinguish between sharing a data source and sharing data itself.
  • Synchronous data sharing in microservices involves real-time communication between services, where the requesting service waits for a response before proceeding.
  • Two approaches for synchronous data sharing are the request/response model and the gateway approach.
  • Asynchronous data sharing enables services to exchange data without waiting for immediate responses.
  • The key components of the asynchronous data-sharing approach are event-driven architecture, message queues, and the concept of eventual consistency.
  • A hybrid approach combines elements of both synchronous and asynchronous methods, balancing the benefits of local data availability with the need for up-to-date and detailed information from other services.
  • Multiple trade-offs need to be considered when choosing a data-sharing approach; key factors are consistency requirements, performance needs, and the desired scalability of the application.

#MicroServices #data #DataSharing
