This repository has been archived by the owner on Oct 22, 2021. It is now read-only.

409 Conflict creating a new unique document. #103

Open
ElMicko opened this issue Apr 10, 2012 · 0 comments

ElMicko commented Apr 10, 2012

We are running a 5 server BigCouch cluster
{"couchdb":"Welcome","version":"1.1.1","bigcouch":"0.4.0"}

We came across a problem creating documents on a database with replicas.
n=3 q=64

I'm loading the database with documents generated out of another system, and doing so with 20 parallel processes.
We are seeing 409 Conflict HTTP status messages returned when generating a unique document that does not exist in the database.

What is confusing me is that we are receiving a conflict message, yet there is no document to conflict with. This is definitely the first time a document has been created with this "_id". The 409 Conflict seems to be generated internally by BigCouch 0.4.0, and the 'new' document is still created. It can be retrieved afterwards, so we can't treat this as a write failure, even though, based on the HTTP status, we probably should.

To replicate the behaviour and isolate the problem, I've written a simple bash script that reproduces it.
(I wanted to be sure our application code was not in some way responsible for the issue)

We are generating unique identifiers for documents, and creating them in a scripted load, and logging the identifiers to a file so we can later ensure they are unique and not the cause of the conflict.

I'm trying to simulate a production load (bulk data load), so I have 12 servers, each running 10 copies of the bash document creation script.
Each script creates 1000 documents, so in total we create 120,000 new and uniquely identified documents.
(the script requires curl and uuidgen)

Here is the test script

-- begin test script

#!/bin/bash

# Generate a temporary file to capture headers returned from couch
header_dump=$(mktemp)

for loopcounter in {1..1000}
do
    # clear the header capture file
    : > "$header_dump"

    # generate a random uuid for this document
    uuid=$(uuidgen -r)

    # capture the generated id so we can confirm it's unique
    echo "$uuid" >> couchload.out

    # PUT a sample.json file to the cluster, capturing the HTTP headers returned
    curl -X PUT --silent --output /dev/null --data-binary @sample.json -H "Content-Type: application/json" --dump-header "$header_dump" "http://couch-cluster-load-balancer/couch-database/$uuid"

    # Confirm that we received a 201 Created header, notify when we receive anything else.
    if ! grep -q "201 Created" "$header_dump"; then
        echo "Did not return a 201 Created HTTP header $uuid"
        cat "$header_dump"
    fi
done

# remove the temporary header capture file
rm -f "$header_dump"

-- end test script

We are seeing the following output

Did not return a 201 Created HTTP header 32d8454a-1318-4117-886c-312ab08dfac5
HTTP/1.0 409 Conflict
Server: nginx/1.0.8
Date: Tue, 10 Apr 2012 04:40:39 GMT
Content-Type: text/plain;charset=utf-8
X-Couch-Request-ID: 7142c748
Content-Length: 58
Cache-Control: must-revalidate
X-Cache: MISS from localhost
X-Cache-Lookup: MISS from localhost:8000
Via: 1.0 localhost (squid/3.0.STABLE19)
Proxy-Connection: close

This shows that we are receiving a 409 Conflict HTTP status even though the "_id" of the document did not conflict with anything that exists in the database. (I've confirmed that by searching the couchload.out file that the script generates.)
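
For anyone wanting to repeat the uniqueness check on the logged identifiers, here is a minimal sketch. It assumes the log has one identifier per line, as the test script writes it; the function name find_duplicate_ids is just an illustrative helper, not part of the original script.

```shell
#!/bin/bash

# Hypothetical helper: print every identifier that appears more than once
# in the given log file (one identifier per line).
find_duplicate_ids() {
    sort "$1" | uniq -d
}

# Check the log written by the load script, if it is present.
if [ -f couchload.out ]; then
    find_duplicate_ids couchload.out
fi
```

No output means every logged identifier is unique. When the logs come from several servers, they would need to be concatenated into one file first.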

We are seeing 2 or 3 of these for every 120,000 documents created. We can work around this for now.
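
One possible shape for such a workaround, sketched under assumptions: after a 409 on a freshly generated id, fetch the document and treat the write as having landed if it comes back with a 200. The function names http_status and check_write and the result labels are hypothetical, and this only makes sense when the id is known to be new.

```shell
#!/bin/bash

# Hypothetical helper: print the HTTP status code of a GET on the given URL.
http_status() {
    curl --silent --output /dev/null --write-out '%{http_code}' "$1"
}

# Hypothetical helper: classify the result of a PUT for a brand new id.
#   $1 = HTTP status returned by the PUT
#   $2 = URL of the document
check_write() {
    local put_status=$1 url=$2
    case "$put_status" in
        201)
            echo "created"
            ;;
        409)
            # The id is new, so check whether the document landed anyway.
            if [ "$(http_status "$url")" = "200" ]; then
                echo "conflict-but-present"
            else
                echo "conflict-and-missing"
            fi
            ;;
        *)
            echo "failed"
            ;;
    esac
}
```

A caller could retry only on "conflict-and-missing" or "failed", and log "conflict-but-present" for later inspection.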

I was hoping someone could confirm whether this is expected behaviour, and perhaps point me to an explanation so we can understand what is going on.
Or is this something that might need to be looked at?

Cheers

Michael
