Google I/O 2009, Day 2
Photo taken by Ms. Jen with her Nokia N95.
Here is my transcription of two sessions from Day 2, 05.28.09, of Google I/O 2009. Per my usual, the following is a combination of live quotes from the speaker, notes off the slides, some paraphrase and a few of my own asides.
Offline Processing on App Engine: A Look Ahead
* GAE is great for web apps
Request based, database backed apps
Do work continuously w/o user requests
Incrementally process data, compute results
Smooth out load patterns, lower user latency
*New style of computation on App Engine
New API for App Engine Overview
* New API for App Engine: Task Queue
* Part of App Engine Labs: API may change until it graduates from Labs; not yet specified how we will enable billing
* Not released yet; should launch in a couple of weeks
* Live for demoing today with working code
What is a task queue?
* Simple idea in general:
1. Describe the work you want to do now
2. Save the description somewhere
3. Have something else execute the work later
* Work executed in the order received
* If execution fails, work will be retried until successful
Asynchronous: Why do work now when we can do it later?
Low latency for users: Tasks are lightweight: ~3x faster than Datastore
Reliable: Once written, a task will eventually complete
Scalable: Storage of new tasks has no contention
Parallelizable with multiple workers.
*Many features can extend this basic concept
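The three steps above (describe the work, save the description, have something else run it later, in order, retrying on failure) can be sketched in plain Python. This is only a toy in-memory illustration, not the App Engine API: the class name `SimpleTaskQueue`, its methods, and the `max_attempts` cap are assumptions made here so that a permanently broken task cannot loop forever.

```python
from collections import deque


class SimpleTaskQueue:
    """Toy in-memory FIFO task queue (illustrative only, not the App Engine API)."""

    def __init__(self):
        self._tasks = deque()

    def add(self, func, *args):
        # Steps 1 and 2: describe the work and save the description.
        self._tasks.append((func, args))

    def run_all(self, max_attempts=5):
        # Step 3: something else executes the work later, in the order received.
        results = []
        while self._tasks:
            func, args = self._tasks.popleft()
            for _ in range(max_attempts):
                try:
                    results.append(func(*args))
                    break  # success: move on to the next task
                except Exception:
                    continue  # failure: retry the same task
        return results


attempts = {"n": 0}


def flaky_double(x):
    # Hypothetical task that fails twice before succeeding.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")
    return x * 2


queue = SimpleTaskQueue()
queue.add(flaky_double, 21)
queue.add(len, "abc")
results = queue.run_all()
print(results)  # [42, 3]
```

The caller of `add` returns immediately; the cost of the work (and of the retries) is paid later by whatever calls `run_all`.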
What is a queue historically?
*Unix had at and batch commands
*People use cron jobs and flat files to append to a DB or file with work to do
Other task systems out there:
* There are many task-queue-like systems out there: *MQ, Amazon SQS, Azure queues, TheSchwartz, Twisted, Starling, etc
Task queues are conflated with publish-subscribe messaging
Queueing systems maximize data throughput: Routers, data pipelines, fully saturate network
* Pub-sub systems maximize transactions, decoupling: large numbers of small transactions per second, one-to-many fan out with changing receivers, guaranteed ordering, filtering, two-phase commit
* Our new API implements queueing, not pub-sub
How do traditional task queues work?
Workers always running
* Polling has problems:
Worker sits in a loop polling the front of the queue: not event driven; wasted work
Workers stay resident when there is no work to do, which wastes resources
Fixed number of workers
*Limited optimization possible
How is GAE task queue different?
* We PUSH tasks to your app; no polling necessary
* HTTP Web hooks
RESTful, push-based interface for doing work
Concept used outside Google & GAE
Many of our upcoming APIs use this style
* Tasks as web hooks
A task is just an HTTP request (URL, body, etc)
Enqueue and we send your app the request later
If the web hook returns HTTP 200 OK, it is done
*Concrete example: Mail sending queue (shows python code)
Shows diagram of how the Task Queue API works: Automatically adds worker threads based on load
* Worker threads added depending on work-load
Max # of threads depends on throughput, high maximum rate limits for safety
*Integrated into admin console as normal requests: Application and request logs searchable, dashboard stats and error rate monitoring, graphs include offline work
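The web hook model described above (a task is a URL plus a body, it gets pushed to the app, and HTTP 200 means done) can be simulated without any server. The sketch below is not the code shown in the talk and does not use the real Task Queue API; `WebHookQueue`, `send_mail_hook`, and the `/work/send_mail` path are hypothetical names chosen for illustration.

```python
from collections import deque


class WebHookQueue:
    """Simulated push queue: tasks are (url, body) pairs delivered to handlers."""

    def __init__(self, handlers):
        # handlers maps a URL path to a callable(body) -> HTTP status code.
        self._handlers = handlers
        self._pending = deque()

    def add(self, url, body):
        self._pending.append((url, body))

    def dispatch_once(self):
        """Push every pending task once; keep the ones that did not return 200."""
        retry = deque()
        while self._pending:
            url, body = self._pending.popleft()
            status = self._handlers[url](body)
            if status != 200:
                retry.append((url, body))  # not done yet, try again later
        self._pending = retry
        return len(retry)


sent = []


def send_mail_hook(body):
    # Hypothetical handler: "sends" mail by recording the recipient.
    if not body.get("to"):
        return 500  # signal failure so the task stays queued
    sent.append(body["to"])
    return 200


hooks = WebHookQueue({"/work/send_mail": send_mail_hook})
hooks.add("/work/send_mail", {"to": "a@example.com"})
hooks.add("/work/send_mail", {"to": ""})
remaining = hooks.dispatch_once()
print(sent, remaining)
```

The handler never polls: it only runs when the queue pushes a request to it, which is the difference from the traditional worker loop.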
Working with tasks: Idempotence
* Important that tasks can be run repeatedly without harmful effects. Why necessary? Failure can happen at any time.
* Run the same task repeatedly w/o harmful effects
* Necessary because failure may happen at any time
* Tasks will be retried until success
* Possible for a task to spuriously run twice even w/o server failures.
* It is your responsibility as the app dev to ensure idempotence of tasks
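One common way to make a task idempotent is to record which task IDs have already been applied and ignore repeats. The sketch below assumes each task carries a unique ID; the `Account` class and `credit` method are hypothetical, and the talk did not prescribe this particular technique.

```python
class Account:
    """Toy account whose credit operation is safe to run more than once."""

    def __init__(self):
        self.balance = 0
        self._applied = set()  # IDs of tasks whose effect is already recorded

    def credit(self, task_id, amount):
        # A repeated task_id is ignored, so a retry or spurious
        # second run does not credit the account twice.
        if task_id in self._applied:
            return False
        self.balance += amount
        self._applied.add(task_id)
        return True


account = Account()
account.credit("task-1", 10)
account.credit("task-1", 10)  # duplicate delivery of the same task
account.credit("task-2", 5)
print(account.balance)  # 15
```

In a real app the applied-ID record and the balance update would need to be written together (for example in one transaction), otherwise a failure between the two could still cause a double credit.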
Working with Queues
* Each task added to a single Queue for execution: multiple queues allowed per application
* Queues provide isolation and separation of tasks
* Configure how each queue is throttled
* Example queue.yaml
- name: mail_queue
* Why do you want to throttle?
Combine work periodically; execute in batches
Ensure stability of workload
Not exceed maximum writes per second for a single entity group in datastore
Not overload a partner site
Shows many to many queue throttling diagram
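The notes do not say how the throttling is implemented. As one illustration of rate limiting in general, here is a token bucket, a common technique swapped in for the sake of example. The class name, parameters, and the explicit `now` argument (used instead of a real clock so the behavior is easy to follow) are all assumptions.

```python
class TokenBucket:
    """Allows up to `capacity` tasks in a burst, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = now

    def try_acquire(self, now):
        # Refill tokens for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True  # task may run now
        return False  # throttled: leave the task in the queue


bucket = TokenBucket(rate_per_sec=1, capacity=2)
decisions = [
    bucket.try_acquire(0.0),  # burst token 1
    bucket.try_acquire(0.0),  # burst token 2
    bucket.try_acquire(0.0),  # bucket empty, throttled
    bucket.try_acquire(1.0),  # one token refilled after a second
]
print(decisions)  # [True, True, False, True]
```

A limiter like this is what keeps a queue from exceeding a datastore entity group's write rate or from overloading a partner site.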
How to do a schema migration in GAE?
Without Task Queue: cron job slowly iterates through entities, migrates them, stores current entity location in memcache
With Task Queue: define handler to: query for next N entities....
Shows example code
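The example code from the slides was not captured. As a rough sketch of the "query for next N entities" idea, the simulation below uses a dict as a stand-in for the datastore and a deque as a stand-in for the task queue. The continuation step (each task enqueues a follow-up task that resumes after the last migrated key) is an assumption about how the rest of the handler might work, and all names are hypothetical.

```python
from collections import deque

BATCH_SIZE = 2  # the "N" in the notes; value chosen arbitrarily

# Stand-in for the datastore: entity key -> entity dict.
entities = {k: {"name": k, "schema": 1} for k in ["a", "b", "c", "d", "e"]}
tasks = deque()  # stand-in for the task queue; each task is a resume key


def migrate_batch(start_after):
    """Migrate the next BATCH_SIZE entities with keys after start_after."""
    keys = sorted(k for k in entities if start_after is None or k > start_after)
    batch = keys[:BATCH_SIZE]
    for key in batch:
        # Hypothetical migration step; setting a fixed value keeps it idempotent.
        entities[key]["schema"] = 2
    if len(batch) == BATCH_SIZE:
        # Assumed continuation: enqueue a task that resumes after the last key.
        tasks.append(batch[-1])


tasks.append(None)  # kick off the migration from the beginning
runs = 0
while tasks:
    migrate_batch(tasks.popleft())
    runs += 1

print(runs, all(e["schema"] == 2 for e in entities.values()))
```

Because each batch carries its own resume position, nothing like the memcache bookkeeping of the cron approach is needed, and a retried batch simply rewrites the same entities.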
Brett goes so fast that I have lost track....
Regardless, this looks good. I am hungry, may go to lunch early
*Batch processing: Task API good for small datasets, More tools required for parallelization, high throughput processing of datastore entities, Need rich features for aggregations, stats
*Map Reduce: plan to eventually support MapReduce abstraction, need more tools: intermediary storage...
* Use task queue api
* Make your existing app faster, with lower latency
* Scale your app further with reduced costs
Looking Beyond the Screen: Text-To-Speech and Eyes-Free Interaction on Android
TV Raman, Charles L. Chen
This was a good session on touch screens and accessibility.
Project home: http://code.google.com/p/eyes-free/
Ms. Jen Note: Sorry, I didn't take notes, as I was sitting on the floor and could not see the screen and most of it was demo, but I could hear the reader reading the interactions from the Android phone to TV Raman (who is blind). The reader sounded very similar to the JAWS reader. The voice reading is to be clearer in the next iteration of Android (aka "Donut"), according to Mr. Raman.
Supporting Multiple Devices with One Binary
Ms. Jen Note: I am sitting on the floor next to a power source for this session and may or may not get the best notes.
Only write your app once
* OS versions : 1.0, 1.5, Donut
* Runtime configuration options: locales, orientation, operator, many more
* Different hardware