Home | 簡體中文 | 繁體中文 | 雜文 | 打賞(Donations) | Github | OSChina 博客 | 雲社區 | 雲棲社區 | Facebook | Linkedin | 知乎專欄 | 視頻教程 | About

16.3. Keyspace

16.3.1. Schema

16.3.1.1. Keyspace

16.3.1.2. Column family

16.3.1.2.1. Name
16.3.1.2.2. Column
16.3.1.2.3. Super column
16.3.1.2.4. Sorting

16.3.2. Keyspace example

例 16.1. Twitter

				
<Keyspace Name="Twitter">
<ColumnFamily CompareWith="UTF8Type" Name="Statuses" />
<ColumnFamily CompareWith="UTF8Type" Name="StatusAudits" />
<ColumnFamily CompareWith="UTF8Type" Name="StatusRelationships"
CompareSubcolumnsWith="TimeUUIDType" ColumnType="Super" />
<ColumnFamily CompareWith="UTF8Type" Name="Users" />
<ColumnFamily CompareWith="UTF8Type" Name="UserRelationships"
CompareSubcolumnsWith="TimeUUIDType" ColumnType="Super" />
</Keyspace>
				
				

例 16.2. Twissandra

				
  <Keyspaces>
    <Keyspace Name="Twissandra">
       <ColumnFamily CompareWith="UTF8Type" Name="User"/>
      <ColumnFamily CompareWith="BytesType" Name="Username"/>
      <ColumnFamily CompareWith="BytesType" Name="Friends"/>
      <ColumnFamily CompareWith="BytesType" Name="Followers"/>
      <ColumnFamily CompareWith="UTF8Type" Name="Tweet"/>
      <ColumnFamily CompareWith="LongType" Name="Timeline"/>
      <ColumnFamily CompareWith="LongType" Name="Userline"/>

      <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>

      <!-- Number of replicas of the data -->
      <ReplicationFactor>1</ReplicationFactor>
      <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>

    </Keyspace>
  </Keyspaces>
  				
				

Schema Layout

In Cassandra, the way that your data is structured is very closely tied to how how it will be retrieved. Let's start with the user ColumnFamily. The key is a user id, and the columns are the properties on the user:

User = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        'id': 'a4a70900-24e1-11df-8924-001ff3591711',
        'username': 'ericflo',
        'password': '****',
    },
}
				

Since some of the URLs on the site actually have the username, we need to be able to map from the username to the user id:

Username = {
    'ericflo': {
        'id': 'a4a70900-24e1-11df-8924-001ff3591711',
    },
}
				

Friends and followers are keyed by the user id, and then the columns are the friend user id and follower user ids, and we store a timestamp as the value because it's interesting information to have:

Friends = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        # friend id: timestamp of when the friendship was added
        '10cf667c-24e2-11df-8924-001ff3591711': '1267413962580791',
        '343d5db2-24e2-11df-8924-001ff3591711': '1267413990076949',
        '3f22b5f6-24e2-11df-8924-001ff3591711': '1267414008133277',
    },
}

Followers = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        # friend id: timestamp of when the followership was added
        '10cf667c-24e2-11df-8924-001ff3591711': '1267413962580791',
        '343d5db2-24e2-11df-8924-001ff3591711': '1267413990076949',
        '3f22b5f6-24e2-11df-8924-001ff3591711': '1267414008133277',
    },
}
				

Tweets are stored in a way similar to users:

Tweet = {
    '7561a442-24e2-11df-8924-001ff3591711': {
        'id': '89da3178-24e2-11df-8924-001ff3591711',
        'user_id': 'a4a70900-24e1-11df-8924-001ff3591711',
        'body': 'Trying out Twissandra. This is awesome!',
        '_ts': '1267414173047880',
    },
}
				

The Timeline and Userline column families keep track of which tweets should appear, and in what order. To that effect, the key is the user id, the column name is a timestamp, and the column value is the tweet id:

Timeline = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        # timestamp of tweet: tweet id
        1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
        1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
        1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
        1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
    },
}

Userline = {
    'a4a70900-24e1-11df-8924-001ff3591711': {
        # timestamp of tweet: tweet id
        1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
        1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
        1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
        1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
    },
}