
Project Outline

Aditya edited this page Mar 28, 2021 · 3 revisions

The goal is to create a distributed processing system for matrix calculations that can auto-scale its workers and also exposes a client-interactive REST interface.

  • REST Interface

  • gRPC Server

  • gRPC Scaling

  • gRPC Deadline Foot-printing and Scaling

  • Code Style and Readability (optional)

  • A REST interface which allows matrices to be uploaded as files.

    • The files can be in any format, but for simplicity I would recommend using the space character to separate elements and new lines to separate rows.
    • The component should be able to accept matrices of arbitrary size, but you can assume they are square matrices whose dimensions are powers of 2, so you can use the divide-and-conquer approach used in lab 1.
    • If a matrix is not this size, the REST interface should throw an error.
    • The REST interface should also have functionality to trigger the matrix multiplication and present the results. The gRPC client code should be integrated into the REST interface to call the addBlock and multBlock functions on the gRPC server. Partial marks will be awarded for matrix multiplication that does not allow matrices of an arbitrary size.
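The upload check described above can be sketched as follows. This is a minimal sketch rather than the required implementation: it assumes the recommended format (spaces between elements, one row per line), and the names `is_power_of_two` and `parse_matrix` are hypothetical helpers, not part of any lab code.

```python
def is_power_of_two(n: int) -> bool:
    """Return True for 1, 2, 4, 8, ... (n & (n - 1) clears the lowest set bit)."""
    return n > 0 and (n & (n - 1)) == 0


def parse_matrix(text: str) -> list[list[float]]:
    """Parse space-separated elements, one row per line, and validate the shape.

    Raises ValueError if the matrix is not square or its dimension is not a
    power of 2. A REST handler could turn that into an error response.
    """
    rows = [[float(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]
    n = len(rows)
    if not is_power_of_two(n):
        raise ValueError(f"row count {n} is not a power of 2")
    if any(len(row) != n for row in rows):
        raise ValueError("matrix is not square")
    return rows
```

A REST endpoint would call `parse_matrix` on the uploaded file contents and report the `ValueError` message to the client when validation fails.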
  • A gRPC server which provides access to the addBlock and multBlock functions. These functions should be able to accept any square matrix whose dimensions are powers of 2.

    There should be multiple instantiations of the gRPC server. You should have at least eight gRPC server threads. There are two ways to go about this: you could use one large instance with 8 cores, e.g. a t2.2xlarge, and configure the gRPC threading model to use each of these cores, or you could have 8 small instances, e.g. t2.small, and configure the gRPC client to spread the load among these servers. You should have sufficient credit for the module to implement the proposed system if you manage it correctly, e.g. by stopping instances when you are not using them. Managing your credit is part of the challenge, so requests for additional credit will not be entertained. The performance improvement from scaling is not quite linear (the additions have to be done after the multiplications), but you should see a significant performance improvement as you add more gRPC servers.
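To illustrate why eight workers fit the divide-and-conquer approach and why additions must wait for multiplications, here is a minimal single-process sketch. It is an assumption-laden stand-in: `add_block` and `mult_block` are local Python functions mimicking the addBlock and multBlock RPCs, a thread pool stands in for the eight gRPC servers, and only one level of splitting is shown.

```python
from concurrent.futures import ThreadPoolExecutor


def add_block(a, b):
    """Local stand-in for the addBlock RPC: element-wise sum."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mult_block(a, b):
    """Local stand-in for the multBlock RPC: plain matrix product."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split(m):
    """Split a square matrix into four quadrants: m11, m12, m21, m22."""
    h = len(m) // 2
    top, bottom = m[:h], m[h:]
    return ([r[:h] for r in top], [r[h:] for r in top],
            [r[:h] for r in bottom], [r[h:] for r in bottom])


def multiply(a, b, workers=8):
    """One level of divide and conquer: 8 block products, then 4 block sums."""
    if len(a) == 1:
        return mult_block(a, b)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    pairs = [(a11, b11), (a12, b21), (a11, b12), (a12, b22),
             (a21, b11), (a22, b21), (a21, b12), (a22, b22)]
    # The eight products are independent, so they can run on separate workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        p = list(pool.map(lambda ab: mult_block(*ab), pairs))
    # Each sum needs two finished products, which limits the speed-up.
    c11, c12 = add_block(p[0], p[1]), add_block(p[2], p[3])
    c21, c22 = add_block(p[4], p[5]), add_block(p[6], p[7])
    return ([l + r for l, r in zip(c11, c12)] +
            [l + r for l, r in zip(c21, c22)])
```

In the real system each `mult_block` call would be a gRPC request sent to one of the servers; the thread pool here only shows how the work partitions.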

  • Large-scale workloads frequently use the notion of a deadline to determine how many servers should be assigned to a workload. Your system should have a deadline-based scaling function. A foot-printing function should be implemented to measure the time required to multiply one block, and this measurement should be used to determine the minimum number of servers required to meet the deadline.
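The foot-printing and deadline-based scaling idea can be sketched as below. This is one possible reading, not the prescribed method: it assumes the block multiplications divide evenly across servers and it ignores addition and network overhead. The names `footprint` and `servers_needed`, and the cap of 8 servers, are assumptions for illustration.

```python
import math
import time


def footprint(mult_block, a, b) -> float:
    """Time a single block multiplication, in seconds."""
    start = time.perf_counter()
    mult_block(a, b)
    return time.perf_counter() - start


def servers_needed(footprint_s: float, block_calls: int,
                   deadline_s: float, max_servers: int = 8) -> int:
    """Estimate the minimum number of servers needed to meet the deadline.

    Assumes total work = footprint_s * block_calls and that it splits evenly.
    The result is clamped to the range 1..max_servers (assumed pool size).
    """
    if deadline_s <= 0:
        raise ValueError("deadline must be positive")
    needed = math.ceil(footprint_s * block_calls / deadline_s)
    return max(1, min(needed, max_servers))
```

For example, with a measured footprint of 0.5 s, 8 block multiplications, and a 1 s deadline, the estimate is 4 servers. If the uncapped estimate exceeds `max_servers`, the deadline cannot be met under these assumptions, and the system may want to report that to the client.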
