GitHub - jstirk/silicon-beach-database: Best described as a distributed database using data portability technologies. People will markup their blogs, websites, or store their information where they want - and ping the central SiliconBeachAustralia.org server to create an aggregated database of people and companies.

This repository has been archived by the owner on Jan 29, 2019. It is now read-only.


groups.google.com/group/siliconbeachaustraliadatabase



= Silicon Beach Australia Distributed Database Initiative

== DESCRIPTION

On Sunday 27th July, Elias Bizannes created the SiliconBeachAustralia.org Google group. Within 24 hours, 83 messages had been posted by the 84 members. One week later, this rapidly expanding community had identified a new initiative. The idea, as initially proposed by Elias and evolved since, is best described as a distributed database using data portability technologies. People will mark up their blogs or websites, or store their information wherever they want, and ping the central SiliconBeachAustralia.org server to create an aggregated database of people and companies.

Further details of the idea: http://groups.google.com/group/silicon-beach-australia/browse_thread/thread/4453a546137df6bf

== CONTRIBUTORS

Elias Bizannes
Wayne Meissner
Warren Seen
Jason Stirk

== OVERVIEW

The core features of the database are to :
# Periodically check pages registered with the system for hResume data and update the data stored in the database;
# Store the hResume data in a database format that facilitates easier searching, aggregation and the construction of a web API to allow other sites to consume the data;
# Allow authors to defer their resume to another site. That is, we should follow redirects when requesting content so that authors may easily change their resume provider. E.g. http://griffin.oobleyboo.com/resume might redirect to my LinkedIn profile, and we should handle that.
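
The redirect handling described in the last point could look roughly like the following. This is only a sketch, not the code in this repository: the method name fetch_following_redirects, the TooManyRedirects error, the limit of 5 hops and the injectable fetch lambda are all assumptions made for illustration. Only Net::HTTP.get_response and URI#merge come from Ruby's standard library.

```ruby
require 'net/http'
require 'uri'

# Hypothetical error raised when a resume URI redirects too many times.
class TooManyRedirects < StandardError; end

# Default fetcher: a single plain GET with no redirect handling of its own.
DEFAULT_FETCH = ->(uri) { Net::HTTP.get_response(uri) }

# Follows up to `limit` redirects starting at `url` and returns
# [final_url, response]. The `fetch` lambda is injectable so the logic can
# be exercised without touching the network.
def fetch_following_redirects(url, limit: 5, fetch: DEFAULT_FETCH)
  uri = URI.parse(url)
  (limit + 1).times do
    response = fetch.call(uri)
    redirect = (300..399).cover?(response.code.to_i) && response['location']
    return [uri.to_s, response] unless redirect
    # URI#merge resolves relative Location headers against the current URI.
    uri = uri.merge(response['location'])
  end
  raise TooManyRedirects, "gave up on #{url} after #{limit} redirects"
end
```

Returning the final URL alongside the response lets the caller record where the resume actually lives, which may be useful if the author later changes provider.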

== TODO

* TESTS! There are currently no tests at all. This really needs to be fixed now that we have some idea of how this is going to work.
* Database should be moved from SQLite3 into MySQL, preferably using MyISAM tables in select places to allow us to do full-text indexing. Alternatively, we could use another database and table type, and push all the content into another indexing system.
* Consider merging Resume and Person models. It doesn't really make sense to allow a single person to have multiple resumes - they should have one authoritative document, not several separate documents. Resume model currently only handles minimal data about when it should be updated, the URI, etc. This could easily be merged in as part of Person.
* Consider normalizing Skills.value out to another table so that we can look skills up by ID, rather than by text comparison.
* Skills are currently split on DOS, Unix or Mac end-of-line sequences, and on single comma (,) characters. Should we split these fields at all? Do we propose some sort of convention for this? Should we use semi-colon (;) rather than comma?
* Experience and Qualifications aren't cleaned out if the resume no longer refers to them. This should probably be done with some sort of timestamp - if we haven't seen the data for a few cycles, drop it out.
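
To make the skill-splitting question above concrete, here is a minimal sketch of the behaviour described in the TODO: splitting on DOS, Unix or Mac line endings and on commas. The names SKILL_SEPARATOR and split_skills are hypothetical and are not taken from the code in this repository. Trimming whitespace and dropping empty entries are extra assumptions.

```ruby
# Matches DOS (\r\n), old Mac (\r) and Unix (\n) line endings, or a comma.
# Switching to semi-colons would only mean changing this pattern.
SKILL_SEPARATOR = /\r\n|\r|\n|,/

# Splits a raw skills field into individual skill strings, trimming
# surrounding whitespace and discarding empty entries.
def split_skills(raw)
  raw.to_s.split(SKILL_SEPARATOR).map(&:strip).reject(&:empty?)
end

split_skills("Ruby, Rails\r\nSQL") # => ["Ruby", "Rails", "SQL"]
```

Keeping the separator in one constant means the convention can be changed in a single place once it is agreed.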

== CURRENT USAGE

Running script/server starts a very basic website.

You can also play with the models via script/console :

r=Resume.new
r.uri='http://www.linkedin.com/in/jasonstirk'
r.save
r.update!

r.summary # => "...."

p=r.person # => #<Person ...>
r.person.given_name # => "Jason"
r.person.locality # => "New South Wales, Australia"
r.person.skills.size # => 11

r.person.experiences.size # => 5
e=r.person.experiences.first # => #<Experience>
e.organization.name # => 'Achernar Solutions'

Although my resume isn't a good example, you can also use methods like :

person.qualifications.size # => 3
person.qualifications.first # => #<Qualification>

Similarly :

org=Organization.find(:first)
org.students # => 1
org.employees # => 1

== PERIODIC UPDATES

Resume::update_all! can be called periodically to update any Resume models that are due for an update. It is intended to be safe even if two running instances overlap: it shouldn't re-process updates that have just been done by another process.

Currently, Resumes are set to be updated once per week.
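
One way to get this kind of overlap safety is to claim due records before processing them, so that a second run starting moments later finds nothing left to do. The sketch below illustrates that idea with an in-memory list and a Mutex. It is an assumption about the general technique, not a description of how Resume::update_all! is implemented, and ResumeQueue, Entry, claim_due and ONE_WEEK are hypothetical names. Against a real database, the claim would more likely be a conditional UPDATE on the row.

```ruby
ONE_WEEK = 7 * 24 * 60 * 60 # seconds, matching the weekly update cycle

# In-memory stand-in for the table of resumes awaiting an update.
class ResumeQueue
  Entry = Struct.new(:uri, :next_update_at)

  def initialize(entries)
    @entries = entries
    @lock = Mutex.new
  end

  # Atomically returns every entry that is due and pushes its next update a
  # week ahead, so an overlapping caller will not pick the same entry again.
  def claim_due(now = Time.now)
    @lock.synchronize do
      due = @entries.select { |entry| entry.next_update_at <= now }
      due.each { |entry| entry.next_update_at = now + ONE_WEEK }
      due
    end
  end
end
```

The trade-off of claiming first is that a run which crashes after claiming leaves those resumes untouched until the following week, unless failures reset the timestamp.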