Update table rest API is not working #102

Open
2 of 8 tasks
aditya-sjsu opened this issue May 9, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@aditya-sjsu

Willingness to contribute

Yes. I would be willing to contribute a fix for this bug with guidance from the OpenHouse community.

OpenHouse version

latest

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 20.0): WSL (Windows)
  • JDK version: 1.8.0_402

Describe the problem

I am trying to set up OpenHouse locally, but the update table REST API returns an error.
I am following the steps mentioned in https://github.com/linkedin/openhouse/blob/main/SETUP.md

  1. Create table request:
curl "${curlArgs[@]}" -XPOST http://localhost:8000/v1/databases/d3/tables/ \
--data-raw '{
  "tableId": "t1",
  "databaseId": "d3",
  "baseTableVersion": "INITIAL_VERSION",
  "clusterId": "LocalFSCluster",
  "schema": "{\"type\": \"struct\", \"fields\": [{\"id\": 1,\"required\": true,\"name\": \"id\",\"type\": \"string\"},{\"id\": 2,\"required\": true,\"name\": \"name\",\"type\": \"string\"},{\"id\": 3,\"required\": true,\"name\": \"ts\",\"type\": \"timestamp\"}]}",
  "timePartitioning": {
    "columnName": "ts",
    "granularity": "HOUR"
  },
  "clustering": [
    {
      "columnName": "name"
    }
  ],
  "tableProperties": {
    "key": "value"
  }
}'
  2. Create table response:
{
    "tableId": "t1",
    "databaseId": "d3",
    "clusterId": "LocalFSCluster",
    "tableUri": "LocalFSCluster.d3.t1",
    "tableUUID": "e307fe92-56af-403d-983b-6cb0da61ef82",
    "tableLocation": "file:/tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json",
    "tableVersion": "INITIAL_VERSION",
    "tableCreator": "DUMMY_ANONYMOUS_USER",
    "schema": "{\"type\":\"struct\",\"schema-id\":0,\"fields\":[{\"id\":1,\"name\":\"id\",\"required\":true,\"type\":\"string\"},{\"id\":2,\"name\":\"name\",\"required\":true,\"type\":\"string\"},{\"id\":3,\"name\":\"ts\",\"required\":true,\"type\":\"timestamp\"}]}",
    "lastModifiedTime": 1715285373822,
    "creationTime": 1715285373822,
    "tableProperties": {
        "policies": "",
        "write.metadata.delete-after-commit.enabled": "true",
        "openhouse.tableId": "t1",
        "openhouse.clusterId": "LocalFSCluster",
        "openhouse.lastModifiedTime": "1715285373822",
        "openhouse.tableVersion": "INITIAL_VERSION",
        "openhouse.creationTime": "1715285373822",
        "openhouse.tableUri": "LocalFSCluster.d3.t1",
        "write.format.default": "orc",
        "write.metadata.previous-versions-max": "28",
        "openhouse.databaseId": "d3",
        "openhouse.tableType": "PRIMARY_TABLE",
        "openhouse.tableLocation": "/tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json",
        "openhouse.tableUUID": "e307fe92-56af-403d-983b-6cb0da61ef82",
        "key": "value",
        "openhouse.tableCreator": "DUMMY_ANONYMOUS_USER"
    },
    "timePartitioning": {
        "columnName": "ts",
        "granularity": "HOUR"
    },
    "clustering": [
        {
            "columnName": "name",
            "transform": null
        }
    ],
    "policies": null,
    "tableType": "PRIMARY_TABLE"
}
  3. GET table request:
curl "${curlArgs[@]}" -XGET http://localhost:8000/v1/databases/d3/tables/t1
  4. GET table response:
{
    "tableId": "t1",
    "databaseId": "d3",
    "clusterId": "LocalFSCluster",
    "tableUri": "LocalFSCluster.d3.t1",
    "tableUUID": "e307fe92-56af-403d-983b-6cb0da61ef82",
    "tableLocation": "file:/tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json",
    "tableVersion": "INITIAL_VERSION",
    "tableCreator": "DUMMY_ANONYMOUS_USER",
    "schema": "{\"type\":\"struct\",\"schema-id\":0,\"fields\":[{\"id\":1,\"name\":\"id\",\"required\":true,\"type\":\"string\"},{\"id\":2,\"name\":\"name\",\"required\":true,\"type\":\"string\"},{\"id\":3,\"name\":\"ts\",\"required\":true,\"type\":\"timestamp\"}]}",
    "lastModifiedTime": 1715285373822,
    "creationTime": 1715285373822,
    "tableProperties": {
        "policies": "",
        "write.metadata.delete-after-commit.enabled": "true",
        "openhouse.tableId": "t1",
        "openhouse.clusterId": "LocalFSCluster",
        "openhouse.lastModifiedTime": "1715285373822",
        "openhouse.tableVersion": "INITIAL_VERSION",
        "openhouse.creationTime": "1715285373822",
        "openhouse.tableUri": "LocalFSCluster.d3.t1",
        "write.format.default": "orc",
        "write.metadata.previous-versions-max": "28",
        "openhouse.databaseId": "d3",
        "openhouse.tableType": "PRIMARY_TABLE",
        "openhouse.tableLocation": "/tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json",
        "openhouse.tableUUID": "e307fe92-56af-403d-983b-6cb0da61ef82",
        "key": "value",
        "openhouse.tableCreator": "DUMMY_ANONYMOUS_USER"
    },
    "timePartitioning": {
        "columnName": "ts",
        "granularity": "HOUR"
    },
    "clustering": [
        {
            "columnName": "name",
            "transform": null
        }
    ],
    "policies": null,
    "tableType": "PRIMARY_TABLE"
}
  5. Update table request:
curl "${curlArgs[@]}" -XPUT http://localhost:8000/v1/databases/d3/tables/t1 \
--data-raw '{
  "tableId": "t1",
  "databaseId": "d3",
  "baseTableVersion":"INITIAL_VERSION",
  "clusterId": "LocalFSCluster",
  "schema": "{\"type\": \"struct\", \"fields\": [{\"id\": 1,\"required\": true,\"name\": \"id\",\"type\": \"string\"},{\"id\": 2,\"required\": true,\"name\": \"name\",\"type\": \"string\"},{\"id\": 3,\"required\": true,\"name\": \"ts\",\"type\": \"timestamp\"}, {\"id\": 4,\"required\": true,\"name\": \"country\",\"type\": \"string\"}]}",
  "timePartitioning": {
    "columnName": "ts",
    "granularity": "HOUR"
  },
  "clustering": [
    {
      "columnName": "name"
    }
  ],
  "tableProperties": {
    "key": "value"
  }
}'
  6. Update table response:
{
  "status": "CONFLICT",
  "error": "Conflict",
  "message": "Entity with key[LocalFSCluster.d3.t1] is modified by another process already, nested exception message: Conflict detected for databaseId: d3, tableId: t1, expected version: /tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json actual version INITIAL_VERSION: The requested user table has been modified/created by other processes.",
  "stacktrace": null,
  "cause": "Conflict detected for databaseId: d3, tableId: t1, expected version: /tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json actual version INITIAL_VERSION: The requested user table has been modified/created by other processes.",
}

Stacktrace, metrics and logs

No response

Code to reproduce bug

No response

What component does this bug affect?

  • Table Service: This is the RESTful catalog service that stores table metadata. :services:tables
  • Jobs Service: This is the job orchestrator that submits data services for table maintenance. :services:jobs
  • Data Services: These are the jobs that perform table maintenance. apps:spark
  • Iceberg internal catalog: This is the internal Iceberg catalog for OpenHouse Catalog Service. :iceberg:openhouse
  • Spark Client Integration: This is the Apache Spark integration for OpenHouse catalog. :integration:spark
  • Documentation: This is the documentation for OpenHouse. docs
  • Local Docker: This is the local Docker environment for OpenHouse. infra/recipes/docker-compose
  • Other: Please specify the component.
@aditya-sjsu added the bug label on May 9, 2024
@divyamsavsaviya

I created a table called 'table10' and tried to update it with "baseTableVersion": "INITIAL_VERSION" and "clusterId": "LocalFSCluster", which I got from the successful table creation response. The update failed due to a conflict, suggesting the table had been modified by another process. However, when I retried after about 4 hours, the update worked.

@HotSushi
Collaborator

Hi @aditya-sjsu, in your update request the baseTableVersion still points to INITIAL_VERSION. Can you change it to /tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json and try again?
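
To avoid copying the path by hand, the current version can be read from a GET response and substituted into the update body. The sketch below is only an illustration: the helper names `current_table_version` and `with_base_version` and the file `update_request.json` are hypothetical, and it assumes the expected version is the value of the `openhouse.tableLocation` property, which is what the error message and the responses in this issue suggest.

```shell
# Hypothetical helper: print the "openhouse.tableLocation" value from a
# table response read on stdin. Plain sed, no JSON parser, so it is only
# a rough sketch for responses shaped like the ones in this issue.
current_table_version() {
  sed -n 's/.*"openhouse\.tableLocation"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: rewrite the baseTableVersion field of a request body
# read on stdin. Assumes the version string contains no '|' or '&'.
with_base_version() {
  sed 's|"baseTableVersion"[[:space:]]*:[[:space:]]*"[^"]*"|"baseTableVersion": "'"$1"'"|'
}

# Usage against the local setup from this issue (not run here):
#   version=$(curl "${curlArgs[@]}" -XGET http://localhost:8000/v1/databases/d3/tables/t1 | current_table_version)
#   with_base_version "$version" < update_request.json |
#     curl "${curlArgs[@]}" -XPUT http://localhost:8000/v1/databases/d3/tables/t1 --data-binary @-
```

Reading the version immediately before each update keeps the request aligned with whatever the service currently holds, at the cost of one extra GET.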

OH table versions are used for atomic updates. Each change/update targets a specific version. If the version in HTS has moved past the version you specified, you'll see the error "Entity with <> is modified by another process already".

An example update scenario is as follows:

# Action          # targetVersion    # versionAfterUpdate
CREATE_TABLE      INITIAL_VERSION    TBL_LOC_1
UPDATE_TABLE_1    TBL_LOC_1          TBL_LOC_2
INSERT_DATA       TBL_LOC_2          TBL_LOC_3
and so on.
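
The scenario above can be mimicked with a toy compare-and-set check. This is only a model of the idea, not OpenHouse code: `stored_version` and `commit` are hypothetical names, and the real service performs the comparison on the server side.

```shell
# Toy model only: hypothetical names, not OpenHouse code.
# The version a new table is expected to start from.
stored_version="INITIAL_VERSION"

# commit TARGET NEXT: apply a change only if TARGET matches the stored version.
commit() {
  if [ "$1" != "$stored_version" ]; then
    echo "CONFLICT: expected $stored_version, got $1"
    return 1
  fi
  stored_version="$2"
  echo "OK: now at $stored_version"
}

commit INITIAL_VERSION TBL_LOC_1            # CREATE_TABLE succeeds
commit INITIAL_VERSION TBL_LOC_2 || true    # stale target is rejected, as in this issue
commit TBL_LOC_1 TBL_LOC_2                  # UPDATE_TABLE_1 succeeds
```

The second call fails because it still targets INITIAL_VERSION after the table has moved to TBL_LOC_1, which is the same pattern as the update request in the report.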

Let me know if you still face this issue.
