User:Multichill/Flickr ripper
A tool to upload large batches of Flickr files. It could be a local pywikipedia-based bot or a web-based bot.
Flow
- Fire up the tool and specify a set of files to work on
- See one global window plus a window for each file.
- Fill out fields for files and press "queue" to add the file to the queue or "skip" to ignore the file
- File gets uploaded
Global and local
- The tool should have a global and a per file part.
- If a field is set in the global part and is set in the local part, the local field should be used.
- If a field is set in the global part and is left empty in the local part, the global field should be used.
- Would be nice to copy fields from previous/next file.
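The precedence rule above (local wins when set, global is the fallback) can be sketched in a few lines. This is only an illustration: the function name `merge_fields` and the use of plain dictionaries are assumptions, not part of any existing tool.

```python
def merge_fields(global_fields, local_fields):
    """Return the effective fields for one file.

    A local value wins when it is set; an empty or missing local value
    falls back to the global value. (Hypothetical helper for illustration.)
    """
    merged = dict(global_fields)
    for name, value in local_fields.items():
        # Treat None and blank strings as "left empty".
        if value is not None and str(value).strip() != "":
            merged[name] = value
    return merged


if __name__ == "__main__":
    global_part = {"author": "Example user", "license": "cc-by-2.0"}
    local_part = {"author": "", "license": "cc-by-sa-2.0"}
    print(merge_fields(global_part, local_part))
```

Copying fields from the previous or next file would then amount to calling the same function with that file's local fields in place of the global ones.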
Fields
- Filename
- Somehow derive from flickr title, can of course be overridden.
- Check for name collisions
- Would be nice to have some automatic counter to not have duplicate filenames.
- Original source (could be set automatically)
- Author(s) (link to flickr user page)
- Date of the work (Taken on "XX" uploaded to flickr on "XX"?)
- Take flickr upload date into account
- Take date in exif into account
- Description (langs) (pull from flickr)
- Other versions (less important, leave it out)
- Permission
- Tag for flickrreview
- Flickrreview it with a certain username
- Not free at flickr, need OTRS-permission
- Additional info: Other templates etc
- Licensing - Copy from flickr or override (needs otrs)
- Categories
- From tags
- From category auto completion tool
Would be nice to update one or more fields for a subset of images.
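The filename rules listed under Fields (derive from the Flickr title, check for collisions, add a counter) could look roughly like the sketch below. The names `clean_title` and `unique_filename`, the set of stripped characters, and the " (2)" counter style are all assumptions. The `is_taken` callback stands in for whatever existence check the real tool would run against the wiki.

```python
import re


def clean_title(title):
    """Replace characters that are problematic in wiki file names.

    The character set here is an assumption for illustration.
    """
    cleaned = re.sub(r'[#<>\[\]|{}:/\\]', ' ', title)
    cleaned = re.sub(r'\s+', ' ', cleaned).strip()
    return cleaned or "Untitled"


def unique_filename(title, extension, is_taken):
    """Derive a filename from a title, adding a counter on collisions.

    is_taken is a caller-supplied function that returns True when a
    name already exists (hypothetical stand-in for a wiki lookup).
    """
    base = clean_title(title)
    candidate = f"{base}.{extension}"
    counter = 2
    while is_taken(candidate):
        candidate = f"{base} ({counter}).{extension}"
        counter += 1
    return candidate


if __name__ == "__main__":
    existing = {"Sunset.jpg", "Sunset (2).jpg"}
    print(unique_filename("Sunset", "jpg", existing.__contains__))
```

Names chosen earlier in the same batch would need to be added to the set of taken names as well, so that two queued files with the same Flickr title do not collide with each other.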
Sources
Existing tools to learn from:
- Special:Upload (main upload form)
- Commonplace
- Commonist
- Flickr upload bot - The standard upload bot. Makes uploading with all the correct info much easier.
- Flickr2Commons - This one is faster than the main Flickr upload bot (which can move really slowly occasionally and takes more clicks as it requires you to edit the image page), however a TUSC login is needed here and it adds a bunch of useless and redundant categories (but this one automatically adds geolocation information).
- FlickrLickr - A collaborative vetting process for selecting free (Creative Commons CC-BY) photos from Flickr, improving their metadata, and uploading them to Commons. No longer maintained and no new accounts will be created.
- Flinfo - For people who want to upload Flickr pictures by themselves.
Notes
- To find duplicates, calculate the SHA-1 hash of each file and compare it with the hashes stored on the wiki
- The Flickr API is the best option: http://www.flickr.com/services/api/
- Python implementation : http://stuvel.eu/projects/flickrapi
- Useful queries:
- flickr.people.findByUsername - Get userid for username
- flickr.people.getPublicPhotos - to get all public images of a user http://www.flickr.com/services/api/explore/?method=flickr.groups.pools.getPhotos
- flickr.groups.search - find id for group
- flickr.groups.getInfo - get info about the group
- flickr.groups.pools.getPhotos - get photos in a pool
- flickr.groups.pools.getContext - get previous and next image in a pool
- flickr.photos.getSizes - get links to different sizes (need original)
- flickr.photos.getInfo - get all info about an image
- flickr.photos.getExif - get exif info
- flickr.photos.geo.getLocation - get location, can be used directly in {{Location dec}}, zoom is an issue
- Geolocation
- Modes: Manual (1-10), semi-automatic (10-100), fully automated (100+)??
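The duplicate check mentioned in the notes (hash the file, then compare with the wiki) might be sketched as follows. The hashing part uses only the standard library. The query parameters assume the MediaWiki API's `list=allimages` module with its `aisha1` parameter; check this against the API documentation of the target wiki before relying on it. The helper names are hypothetical, and no request is actually sent here.

```python
import hashlib


def sha1_of_stream(stream, chunk_size=1 << 20):
    """Return the hex SHA-1 of a binary file-like object, read in chunks."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def duplicate_query_params(sha1_hex):
    """Build parameters for a duplicate lookup on the wiki.

    Assumes the MediaWiki API accepts list=allimages with aisha1;
    this only builds the parameters and does not contact the wiki.
    """
    return {
        "action": "query",
        "list": "allimages",
        "aisha1": sha1_hex,
        "format": "json",
    }


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        print(duplicate_query_params(sha1_of_stream(handle)))
```

Reading in chunks keeps memory use flat for large originals, which matters when a batch contains hundreds of files.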