apache/nutch-webapp

Apache Nutch WebApp README

For the latest information about Nutch, please visit our website at:

https://nutch.apache.org/

and our wiki, at:

https://cwiki.apache.org/confluence/display/NUTCH/Home

Introduction

The Nutch WebApp is built using the Apache Wicket Java web framework and Spring.

Running locally

N.B. Currently, you must have a running Nutch REST Server on the same host.
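A quick way to confirm the REST server requirement before starting the WebApp is to probe it first. The sketch below is illustrative only: the helper names are hypothetical, and the default port (8081) and the `/admin` status path are assumptions about the Nutch REST server, so adjust them to match your installation.

```shell
#!/bin/sh
# Hypothetical helper: build the status URL of a Nutch REST server.
# The default host, the port 8081, and the /admin path are assumptions.
nutch_rest_url() {
  host="${1:-localhost}"
  port="${2:-8081}"
  printf 'http://%s:%s/admin\n' "$host" "$port"
}

# Return 0 if the server answers within 5 seconds, non-zero otherwise.
# Arguments (host, port) are passed through to nutch_rest_url.
nutch_rest_is_up() {
  curl --silent --fail --max-time 5 --output /dev/null "$(nutch_rest_url "$@")"
}
```

With these functions sourced into your shell, `nutch_rest_is_up && mvn jetty:run` would start the WebApp only when the probe succeeds.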

You can run the WebApp by executing the following command:

% mvn jetty:run

If you want to run the WebApp in a Jakarta Servlet container, e.g. Apache Tomcat, then run the following commands:

% mvn clean install -DskipTests
% cp target/nutch-webapp-1.0-SNAPSHOT.war $CATALINA_HOME/webapps

You can then access the WebApp on the Tomcat host on port 8080.
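When Tomcat auto-deploys a WAR from its webapps directory, it normally uses the WAR file name (without the `.war` suffix) as the context path. The following sketch derives the resulting URL; `war_url` is a hypothetical helper, the host default is an assumption, and special cases such as `ROOT.war` are not handled.

```shell
#!/bin/sh
# Hypothetical helper: derive the URL under which Tomcat serves a WAR file.
# Tomcat uses the WAR base name (without ".war") as the context path.
war_url() {
  war_file="$1"
  host="${2:-localhost}"   # assumption: Tomcat runs on this machine
  port="${3:-8080}"        # port 8080, as stated above
  context="$(basename "$war_file" .war)"
  printf 'http://%s:%s/%s/\n' "$host" "$port" "$context"
}

war_url target/nutch-webapp-1.0-SNAPSHOT.war
# prints http://localhost:8080/nutch-webapp-1.0-SNAPSHOT/
```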

Contributing

To contribute a patch, follow these instructions (note that installing Hub is not strictly required, but is recommended).

0. Download and install hub.github.com
1. File a JIRA issue for your fix at https://issues.apache.org/jira/projects/NUTCH/issues
- you will get an issue ID of the form NUTCH-xxx, where xxx is the issue number.
2. git clone https://github.com/apache/nutch-webapp.git
3. cd nutch-webapp
4. git checkout -b NUTCH-xxx
5. edit files (please try to include a test case if possible)
6. git status (make sure it shows what files you expected to edit)
7. Make sure that your code complies with the [Nutch code formatting template](https://raw.githubusercontent.com/apache/nutch/master/eclipse-codeformat.xml), which is basically two-space indents
8. git add <files>
9. git commit -m "fix for NUTCH-xxx contributed by <your username>"
10. git fork
11. git push -u <your git username> NUTCH-xxx
12. git pull-request
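The branch and commit naming in the steps above can be checked with a small script before pushing. This is a minimal sketch, not part of the project's tooling: `is_issue_id` and `commit_message` are hypothetical helper names, and the pattern assumes issue IDs consist of `NUTCH-` followed only by digits.

```shell
#!/bin/sh
# Hypothetical helpers for the branch and commit naming used in the steps above.

# Succeed only if the argument looks like a JIRA issue ID such as NUTCH-123.
is_issue_id() {
  case "$1" in
    NUTCH-*[!0-9]*|NUTCH-) return 1 ;;  # non-digit after the prefix, or no number
    NUTCH-*) return 0 ;;                # prefix followed by digits only
    *) return 1 ;;                      # anything else
  esac
}

# Print the commit message from step 9 for a given issue ID and username.
commit_message() {
  printf 'fix for %s contributed by %s\n' "$1" "$2"
}
```

For example, `is_issue_id "$issue" && git checkout -b "$issue"` creates the branch only for a well-formed ID, and `git commit -m "$(commit_message "$issue" "$user")"` produces the message format from step 9. Here `issue` and `user` are placeholder variables you would set yourself.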

IDE setup

Generate Eclipse project files:

mvn eclipse:eclipse

and follow the instructions in Importing existing projects.

IntelliJ IDEA users can also import Eclipse projects using the ["Eclipser" plugin](https://plugins.jetbrains.com/plugin/7153-eclipser); see also Importing Eclipse Projects into IntelliJ IDEA.

About

Apache Nutch is an extensible and scalable web crawler
