Hexo

发表于2022-03-04|更新于2022-03-04|Distributed System•MIT6.824

MIT6.824 Lab3A 实现笔记MIT6.824的第三个实验要求我们在Lab2中实现的Raft基础上完成一个容灾的key-value store服务。整个实验分为两个部分，即Lab3A和Lab3B；前者要求我们在不考虑Snapshot的情况下实现这样一个key-value store的基本Put/Append/Get功能，后者则要求我们在Lab3A的基础上实现snapshot以帮助crash的kv server快速恢复状态。这里将实现Lab3A的过程中的一些问题进行记录, 但根据课程的要求，这里并不直接贴出实现代码，而仅仅提一下思路。基本设计课程网站上的diagram给出了整个分布式key-value store的架构： Clerk负责接收来自Client的请求并将请求发送到Kv server上进行处理 KVServer是一个本地的key-value store，其后台运行着一个raft routine来确保log entry的replication正常进行。KVServer需要对来自Clerk的读写请求进行相应的操作。写请求的大致流程如下： Clerk接收到Put/ ...

LevelDB阅读笔记:key相关数据结构

发表于2022-02-24|更新于2022-09-06|CodingDatabase|LevelDB•C++•KV

LevelDB中key相关的数据结构LevelDB内部以及提供给用户的接口使用多种数据结构来表示不同的key，每种key都有自己的内部结构以及含义。其大致关系如下图所示： SliceSlice意为切片，即对字节流一部分的引用。此概念在诸如Python等高级语言中是语言内置特性。在LevelDB中，Slice的作用是引用字符串以减少在传递字符串数据时导致的copy开销。Slice被定义在include/leveldb/slice.h中，是暴露给外界使用的公共类。 Slice由于只提供对字节流的引用，本身不负责对数据的管理，所以其包含的数据成员相对简单：仅包含引用字节流的开始位置以及引用的大小 class LEVELDB_EXPORT Slice { public: // Create an empty slice. Slice() : data_(""), size_(0) {} // Create a slice that refers to d[0,n-1]. Slice(const char* d, size_t n) : data ...

MICA Reading Notes

发表于2022-02-22|更新于2022-02-22|key-value store•database

MICA: A Holistic Approach to Fast In-Memory Key-Value StorageOverviewMICA is an in-memory key-value store system that enables fast Put and Get operation. In this paper, the author proposes three aspects of designing an in-memory key-value store, as follows: The three aspects includes: Use data partition or shared data replacement strategy? How to effectively process network request packets? Design data structures for fast processing requests Based on above design aspects, the author designs ...

LevelDB阅读笔记:读写流程

发表于2022-02-16|更新于2022-09-06|CodingDatabase|LevelDB•C++•KV

DB内部读写流程上一节已经说过，class DB只是提供接口功能的抽象基类，leveldb通过DBImpl类来实现具体的接口功能： // db_impl.h class DBImpl : public DB { ... }; 出于简单考虑，本节只会分析读写流程，并且由于这部分涉及到很多其他组件的实现，我们并不能将所有实现细节完全讲出，而只能给出一个大概的执行过程。但这一部分毫无疑问是阅读其他代码组件的基础，正是从接口实现出发，我们可以沿着这一条主线前进，并且在遇到涉及其他组件的时候阅读其内部实现细节，从而建立对leveldb的整体认识。尽管这种方法不一定最好，但对于我这类初学者来说，应该是最容易想到和实施的方法了。 DBImpl的Put,Delete对原始的接口进行了override，如下所示： class DBImpl : public DB { DBImpl(const Options& options, const std::string& dbname); DBImpl(const DBImpl&) = del ...

LevelDB阅读笔记:DB接口

发表于2022-01-26|更新于2022-09-06|CodingDatabase|LevelDB•C++•KV

LevelDB使用接口class DB在leveldb中，作者提供了class DB这个抽象基类作为整个levelbd库暴露给外界的接口，这个基类包含了leveldb这个kv系统应该支持的操作接口，例如打开数据库，关闭数据库，读写操作，快照等等。该类放置在文件include/leveldb/db.h中。除了抽象基类db，其他的一些需要被用户使用到的对象，包括：用于打开DB的Option类，用于进行读写操作的ReadOption, WriteOption等等也在该文件中被声明，其定义可以在include/leveldb/option.h中找到。总的来说，整个db.h文件的构建如下： struct Options; struct ReadOptions; struct WriteOptions; class WriteBatch; class LEVELDB_EXPORT DB { public: static Status Open(const Options& options, const std::string& name, ...

BigTable Reading Notes

发表于2022-01-26|更新于2022-01-27|Computer Science|Distributed System•MIT6.824

Bigtable: A Distributed Storage System for Structured DataWhat is Bigtable?The title of this paper might be a little confusing since it contains two key words “Table” and “Structured Data”. One may refer Bigtable as a distributed relational database system but it’s actually not. The paper has defined BigTable explicitly: A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterp ...

GFS Reading Notes

发表于2022-01-25|更新于2022-01-25|Distributed System•MIT6.824

The Google File SystemPre-wordsBefore we started talking about GFS, we need to first talk about the basic design challenges of distributed systems. As professor Morris mentioned in this course, the relationship between a few basic factors of distrbuted systems can be expressed as follows: To achive better performance, the system use multiple server to gain parallel execution or storage. However, multiple servers may face constant failures and thus replication is necessary for the purpose o ...

MapReduce Reading Notes

发表于2022-01-25|更新于2022-01-25|Distributed System•MIT6.824

MapReduce: Simpliﬁed Data Processing on Large ClustersOverviewMapReduce: A program model and associated implementation of large data processing. It abstracts data processing into two phase: Map and Reduce. The MapReduce system itself takes full charge in system related problems, e.g. server failure, worker scheduling, which enables the user (with no experience in distributed computing) to only consider designing proper Map and Reduce function for their specific purpose. This 2004 OSDI paper dep ...