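Creating DynamoDB Tables and Importing Data

We first create the tables and then import the data. The data comes from the AWS sample data set: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SampleData.html

Creating the tables

Run the following commands to create four tables. The wait commands at the end block until each table has been created: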

aws dynamodb create-table \
    --table-name ProductCatalog \
    --attribute-definitions \
        AttributeName=Id,AttributeType=N \
    --key-schema \
        AttributeName=Id,KeyType=HASH \
    --provisioned-throughput \
        ReadCapacityUnits=10,WriteCapacityUnits=5 \
    --tags Key=auto-delete,Value=no

aws dynamodb create-table \
    --table-name Forum \
    --attribute-definitions \
        AttributeName=Name,AttributeType=S \
    --key-schema \
        AttributeName=Name,KeyType=HASH \
    --provisioned-throughput \
        ReadCapacityUnits=10,WriteCapacityUnits=5 \
    --tags Key=auto-delete,Value=no

aws dynamodb create-table \
    --table-name Thread \
    --attribute-definitions \
        AttributeName=ForumName,AttributeType=S \
        AttributeName=Subject,AttributeType=S \
    --key-schema \
        AttributeName=ForumName,KeyType=HASH \
        AttributeName=Subject,KeyType=RANGE \
    --provisioned-throughput \
        ReadCapacityUnits=10,WriteCapacityUnits=5 \
    --tags Key=auto-delete,Value=no

aws dynamodb create-table \
    --table-name Reply \
    --attribute-definitions \
        AttributeName=Id,AttributeType=S \
        AttributeName=ReplyDateTime,AttributeType=S \
    --key-schema \
        AttributeName=Id,KeyType=HASH \
        AttributeName=ReplyDateTime,KeyType=RANGE \
    --provisioned-throughput \
        ReadCapacityUnits=10,WriteCapacityUnits=5 \
    --tags Key=auto-delete,Value=no

aws dynamodb wait table-exists --table-name ProductCatalog && \
aws dynamodb wait table-exists --table-name Reply && \
aws dynamodb wait table-exists --table-name Forum && \
aws dynamodb wait table-exists --table-name Thread
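
The result can also be checked from the command line by querying each table's status with describe-table. The following is a minimal sketch; the helper names table_status and report_tables are assumptions introduced here for illustration and are not part of the AWS CLI:

```shell
# Hypothetical helper: print the status (e.g. ACTIVE) of one table.
table_status() {
    aws dynamodb describe-table \
        --table-name "$1" \
        --query 'Table.TableStatus' \
        --output text
}

# Hypothetical helper: print "<table> <status>" for every table name given.
report_tables() {
    for table in "$@"; do
        printf '%s %s\n' "$table" "$(table_status "$table")"
    done
}

# Usage (requires AWS credentials and the tables created above):
#   report_tables ProductCatalog Forum Thread Reply
```

Each table should report ACTIVE once the wait commands have returned.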

Once the commands finish, the four tables are visible in the DynamoDB console:

image-20241226143622471

Importing the data

Download and unzip the sample files:

wget https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/samples/sampledata.zip

unzip sampledata.zip

Load the data with the batch-write-item command, one request file per table:

aws dynamodb batch-write-item --request-items file://ProductCatalog.json
aws dynamodb batch-write-item --request-items file://Forum.json
aws dynamodb batch-write-item --request-items file://Thread.json
aws dynamodb batch-write-item --request-items file://Reply.json

The JSON files have the following format:

$ head -50 Forum.json
{
    "Forum": [
        {
            "PutRequest": {
                "Item": {
                    "Name": {"S":"Amazon DynamoDB"},
                    "Category": {"S":"Amazon Web Services"},
                    "Threads": {"N":"2"},
                    "Messages": {"N":"4"},
                    "Views": {"N":"1000"}
                }
            }
        },
        {
            "PutRequest": {
                "Item": {
                    "Name": {"S":"Amazon S3"},
                    "Category": {"S":"Amazon Web Services"}
                }
            }
        }
    ]
}

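The top-level key (Forum) is the name of the target table. Its value is the list of write requests, one PutRequest per item.

Each of the four commands returns the following, indicating that no items were left unprocessed: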

    {
        "UnprocessedItems": {}
    }
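
If UnprocessedItems is not empty (for example, when writes are throttled), the returned entries have the same shape as the --request-items input and can be submitted again. The following is a minimal sketch of such a retry loop; the function name batch_write_with_retry, the default of five attempts, and the linear backoff are all assumptions made for illustration (exponential backoff is the more common choice):

```shell
# Hypothetical helper: send a request file with batch-write-item and resend
# whatever comes back in UnprocessedItems, up to max_attempts times.
batch_write_with_retry() {
    request_file="$1"
    max_attempts="${2:-5}"
    pending="$(mktemp)" || return 1
    cp "$request_file" "$pending" || return 1
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # UnprocessedItems uses the same shape as --request-items,
        # so it can be written to a file and submitted again unchanged.
        unprocessed="$(aws dynamodb batch-write-item \
            --request-items "file://$pending" \
            --query 'UnprocessedItems' \
            --output json)" || { rm -f "$pending"; return 1; }
        # An empty map means every item was written.
        if [ "$(printf '%s' "$unprocessed" | tr -d '[:space:]')" = "{}" ]; then
            rm -f "$pending"
            return 0
        fi
        printf '%s\n' "$unprocessed" > "$pending"
        sleep "$attempt"    # simple linear backoff (an assumption)
        attempt=$((attempt + 1))
    done
    rm -f "$pending"
    echo "unprocessed items remain for $request_file after $max_attempts attempts" >&2
    return 1
}

# Usage (requires AWS credentials and the unzipped sample files):
#   batch_write_with_retry Forum.json
```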

image-20230107101037424

BatchGetItem

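In addition to BatchWriteItem, DynamoDB also supports BatchGetItem, which retrieves up to 100 items in a single call.

Question: Does reading 100 items with BatchGetItem consume fewer RCUs than calling GetItem 100 times?

Answer: No. Read capacity is charged per item, as if each item had been read with its own GetItem call.

Test:

Create five items in a DynamoDB table (global-table-test in this example):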

image-20210811175627937

Read them with batch-get-item:

aws dynamodb batch-get-item --request-items file://1.json --return-consumed-capacity TOTAL

The contents of 1.json:

{
    "global-table-test": {
        "Keys": [
            {
                "id": {"S": "1"}
            },
            {
                "id": {"S": "2"}
            },
            {
                "id": {"S": "4"}
            },
            {
                "id": {"S": "5"}
            },
            {
                "id": {"S": "3"}
            }
        ]
    }
}

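The returned CapacityUnits is 2.5. BatchGetItem uses eventually consistent reads by default, and an eventually consistent read of an item up to 4 KB costs 0.5 RCU, so five small items cost 5 × 0.5 = 2.5 RCUs. This is the same as five separate eventually consistent GetItem calls: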

image-20210811175717922
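
The arithmetic behind this number can be sketched as a small shell function. It assumes the documented rule that each item is rounded up to the next 4 KB and that an eventually consistent read costs half of a strongly consistent one; the function name estimate_read_rcu and the 100-byte item size are assumptions chosen for illustration:

```shell
# Hypothetical helper: estimate the RCUs consumed when reading item_count
# items of item_size_bytes each. Consistency is "eventual" (default) or "strong".
# The estimate is the same whether the items are read with BatchGetItem
# or with individual GetItem calls, because capacity is charged per item.
estimate_read_rcu() {
    awk -v n="$1" -v size="$2" -v mode="${3:-eventual}" 'BEGIN {
        units = int((size + 4095) / 4096)          # round each item up to 4 KB
        if (units < 1) units = 1
        if (mode != "strong") units = units / 2    # eventual reads cost half
        print n * units
    }'
}

estimate_read_rcu 5 100           # five small items, eventually consistent: 2.5
estimate_read_rcu 5 100 strong    # same items, strongly consistent: 5
estimate_read_rcu 100 100         # one hundred small items: 50
```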

References: https://docs.aws.amazon.com/cli/latest/reference/dynamodb/batch-get-item.html

https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchGetItem.html