Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEZ-4547: Add Tez AM JobID to the JobConf #339

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

VenkatSNarayanan
Copy link

Some committers require a job-wide UUID to function correctly. Adding the AM JobID to the JobConf
will allow applications to pass that to
the committers that need it.

Some committers require a job-wide UUID to function
correctly. Adding the AM JobID to the JobConf
will allow applications to pass that to
the committers that need it.
@tez-yetus

This comment was marked as outdated.

@tez-yetus

This comment was marked as outdated.

@@ -417,6 +418,7 @@ protected List<Event> initializeBase() throws IOException, InterruptedException
.createMockTaskAttemptID(getContext().getApplicationId().getClusterTimestamp(),
getContext().getTaskVertexIndex(), getContext().getApplicationId().getId(),
getContext().getTaskIndex(), getContext().getTaskAttemptNumber(), isMapperOutput);
jobConf.set(MRJobConfig.MR_PARENT_JOB_ID, new JobID(String.valueOf(getContext().getApplicationId().getClusterTimestamp()), getContext().getApplicationId().getId()).toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to line above TaskAttemptID taskAttemptId =

Comment on lines 151 to 152
assertNotEquals(parentJobID,invalidJobID);
assertNotEquals(output.jobConf.get(org.apache.hadoop.mapred.JobContext.TASK_ATTEMPT_ID),parentJobID);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix code formatting. space after ,

Copy link
Contributor

@shameersss1 shameersss1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my understanding is correct, Hive/Pig would use the value from mapreduce.parent.job.id to set the correct committer UUID right?

@@ -119,6 +119,7 @@ public void abortOutput(VertexStatus.State finalState) throws IOException {
|| jobConf.getBoolean("mapred.mapper.new-api", false)) {
newApiCommitter = true;
}
jobConf.set(MRJobConfig.MR_PARENT_JOB_ID, new org.apache.hadoop.mapred.JobID(String.valueOf(getContext().getApplicationId().getClusterTimestamp()), getContext().getApplicationId().getId()).toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have String.valueOf(getContext().getApplicationId().getClusterTimestamp()), getContext().getApplicationId().getId()).toString( inside a method and re-use it in MROutput.java as well ?

@VenkatSNarayanan
Copy link
Author

If my understanding is correct, Hive/Pig would use the value from mapreduce.parent.job.id to set the correct committer UUID right?

Yes, that was the plan. The property name was just chosen arbitrarily so I could put the PR up, any suggestions for a better one are welcome.

This commit also adds the DAG identifier to the job UUID
to ensure that multiple jobs within the same session will
be assigned different UUIDs.
@tez-yetus

This comment was marked as outdated.

@tez-yetus

This comment was marked as outdated.

@VenkatSNarayanan
Copy link
Author

@shameersss1 We could actually just set fs.s3a.committer.uuid directly instead of the indirection through the other setting.

Switch UUID property name to the
one required by S3A committers.
@tez-yetus

This comment was marked as outdated.

Copy link
Contributor

@shameersss1 shameersss1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1

@shameersss1
Copy link
Contributor

@abstractdog - Could you please review the same ?

Refactors the implementation to reuse Tez's
DAGID type instead of hand-rolling our own.
@tez-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 26m 4s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ master Compile Tests _
+0 🆗 mvndep 6m 25s Maven dependency ordering for branch
+1 💚 mvninstall 12m 58s master passed
+1 💚 compile 1m 54s master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
+1 💚 compile 1m 46s master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
+1 💚 checkstyle 2m 8s master passed
+1 💚 javadoc 1m 44s master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
+1 💚 javadoc 1m 29s master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
+0 🆗 spotbugs 1m 20s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 3m 50s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 11s Maven dependency ordering for patch
+1 💚 mvninstall 1m 9s the patch passed
+1 💚 compile 1m 18s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
+1 💚 javac 1m 18s the patch passed
+1 💚 compile 1m 5s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
+1 💚 javac 1m 5s the patch passed
-0 ⚠️ checkstyle 0m 12s tez-api: The patch generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16)
-0 ⚠️ checkstyle 0m 18s tez-mapreduce: The patch generated 2 new + 368 unchanged - 0 fixed = 370 total (was 368)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
+1 💚 javadoc 0m 51s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
+1 💚 findbugs 3m 5s the patch passed
_ Other Tests _
+1 💚 unit 2m 18s tez-api in the patch passed.
+1 💚 unit 1m 23s tez-mapreduce in the patch passed.
+1 💚 unit 4m 58s tez-dag in the patch passed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
78m 42s
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/6/artifact/out/Dockerfile
GITHUB PR #339
JIRA Issue TEZ-4547
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile
uname Linux efb9ed2193d6 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/tez.sh
git revision master / f080031
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
checkstyle https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/6/artifact/out/diff-checkstyle-tez-api.txt
checkstyle https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/6/artifact/out/diff-checkstyle-tez-mapreduce.txt
Test Results https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/6/testReport/
Max. process+thread count 492 (vs. ulimit of 5500)
modules C: tez-api tez-mapreduce tez-dag U: .
Console output https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/6/console
versions git=2.34.1 maven=3.6.3 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@VenkatSNarayanan
Copy link
Author

@abstractdog @shameersss1 Is there anything else needed?

@tez-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 27m 44s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ master Compile Tests _
+0 🆗 mvndep 5m 31s Maven dependency ordering for branch
+1 💚 mvninstall 10m 37s master passed
+1 💚 compile 1m 55s master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
+1 💚 compile 1m 44s master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
+1 💚 checkstyle 1m 59s master passed
+1 💚 javadoc 1m 43s master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
+1 💚 javadoc 1m 31s master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
+0 🆗 spotbugs 1m 20s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 3m 46s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for patch
+1 💚 mvninstall 1m 9s the patch passed
+1 💚 compile 1m 17s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
+1 💚 javac 1m 17s the patch passed
+1 💚 compile 1m 6s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
+1 💚 javac 1m 6s the patch passed
-0 ⚠️ checkstyle 0m 12s tez-api: The patch generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16)
-0 ⚠️ checkstyle 0m 18s tez-mapreduce: The patch generated 2 new + 368 unchanged - 0 fixed = 370 total (was 368)
-0 ⚠️ checkstyle 0m 27s tez-dag: The patch generated 1 new + 281 unchanged - 0 fixed = 282 total (was 281)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
+1 💚 javadoc 0m 53s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
+1 💚 findbugs 3m 3s the patch passed
_ Other Tests _
+1 💚 unit 2m 15s tez-api in the patch passed.
+1 💚 unit 1m 24s tez-mapreduce in the patch passed.
+1 💚 unit 5m 2s tez-dag in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
76m 54s
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/7/artifact/out/Dockerfile
GITHUB PR #339
JIRA Issue TEZ-4547
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile
uname Linux 72ee90ea8cae 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/tez.sh
git revision master / 38c5aac
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
checkstyle https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/7/artifact/out/diff-checkstyle-tez-api.txt
checkstyle https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/7/artifact/out/diff-checkstyle-tez-mapreduce.txt
checkstyle https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/7/artifact/out/diff-checkstyle-tez-dag.txt
Test Results https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/7/testReport/
Max. process+thread count 410 (vs. ulimit of 5500)
modules C: tez-api tez-mapreduce tez-dag U: .
Console output https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-339/7/console
versions git=2.34.1 maven=3.6.3 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants