Pengfei Xue: Using Redis Data Types

当我们需要更好地使用Redis时, 需要了解一下Redis支持的数据类型. Redis常被认为是常驻内存的key-value存储, 事实上, Redis将所有的数据都存在内存里, 而且为了持久化存储, 会把数据写入到磁盘(Redis会根据有多少个key值修改过, 将数据写到磁盘上, 默认的是, 1分钟内1000+keys修改过, 或者15分钟内9-keys被修改), 可是, Redis并不仅仅是一种key-value存储. Redis支持五种数据结构, 在这五种数据结构中, 只有一种是典型的key-value结构. 理解这些数据结构, 并且利用这些数据结构来建模解决问题, 对于高效地利用Redis至关重要.

Reids支持内建的数据类型, 使得开发人员可以更有意义地构建数据, 这和大多数其它的NoSQL/key-value存储解决方案不同. 利用这些内建的数据类型, 在处理数据类型相关的操作时为Redis带来了意想不到的好处, 处理这些操作时, 会比在Redis之外处理更有效, 也更快.

在深入了解这些数据结构之前, 我们需要先了解一些东西, 并且在设计key的结构时记住这些.

定义的key空间需要保持一致, key可以是任意的字符, 我们可以利用一些分隔符来划分出一些命名空间. 常见的例子就是用冒号来分割, cache:project:319:tasks.
定义key的时候, 尽量地找一个合适程度, 并且含义明确的key.

Strings
The simplest data type in Redis is a string. Strings are also the typical (and frequently the sole) data type in other key-value storage engines. You can store strings of any kind, including binary data. You might, for example, want to cache image data for avatars in a social network. The only thing you need to keep in mind is that a specific value inside Redis shouldn’t go beyond 512MB of data.
EG, set users:leto "{name: leto, planet: dune, likes: [spice]}"

Lists
Lists in Redis are ordered lists of binary safe strings, implemented on the idea of a linked list. 给定一个index, 找某个特定的值, 这个操作比较低效, 在头尾操作的效率会比较高. You might want to use lists in order to implement structures such as queues, a recipe for which we’ll look into later in the book.

Hashes
Much like traditional hash tables, hashes in Redis store several fields and their values inside a specific key. Hashes are a perfect option to map complex objects inside Redis, by using fields for object attributes (example fields for a car object might be “color”, “brand”, “license plate”). EG.
hset users:goku powerlevel 9000
hget users:goku powerlevel
Sets and Sorted Sets
Sets in Redis are an unordered collection of binary-safe strings. Elements in a given set can have no duplicates. For instance, if you try to add an element wheel to a set twice, Redis will ignore the second operation. Sets allow you to perform typical set operations such as intersections and unions.
While these might look similar to lists, their implementation is quite different and they are suited to different needs due to the different operations they make available. 内存使用上, 比list用的更多.

Sorted sets are a particular case of the set implementation that are defined by a score in addition to the typical binary-safe string. This score allows you to retrieve an ordered list of elements by using the ZRANGE command.

BIG O

Redis documentation tells us the Big O notation for each of its commands. It also tells us what the factors are that influence the performance. Let's look at some examples.

The fastest anything can be is O(1) which is a constant. Whether we are dealing with 5 items or 5 million, you'll get the same performance. The sismember command, which tells us if a value belongs to a set, is O(1). sismember is a powerful command, and its performance characteristics are a big reason for that. A number of Redis commands are O(1).

Logarithmic, or O(log(N)), is the next fastest possibility because it needs to scan through smaller and smaller partitions. Using this type of divide and conquer approach, a very large number of items quickly gets broken down in a few iterations. zadd is a O(log(N)) command, where N is the number of elements already in the set.

Next we have linear commands, or O(N). Looking for a non-indexed row in a table is an O(N) operation. So is using the ltrim command. However, in the case of ltrim, N isn't the number of elements in the list, but rather the elements being removed. Using ltrim to remove 1 item from a list of millions will be faster than using ltrim to remove 10 items from a list of thousands. (Though they'll probably both be so fast that you wouldn't be able to time it.)

zremrangebyscore which removes elements from a sorted set with a score between a mini- mum and a maximum value has a complexity of O(log(N)+M). This makes it a mix. By reading the documentation we see that N is the number of total elements in the set and M is the num- ber of elements to be removed. In other words, the number of elements that'll get removed is probably going to be more significant, in terms of performance, than the total number of elements in the list.

The sort command, which we'll discuss in greater detail in the next chapter has a complexity of O(N+M*log(M)). From its performance characteristic, you can probably tell that this is one of Redis' most complex commands.

There are a number of other complexities, the two remaining common ones are O(Nˆ2) and O(CˆN). The larger N is, the worse these perform relative to a smaller N. None of Redis' commands have this type of complexity.

Pengfei Xue

Wednesday, August 1, 2012

Using Redis Data Types

No comments:

Post a Comment

About Me

Blog Archive