<aside> 💡 Hey fellow devs! I'm Kyohei, cofounder of a privacy tech startup called Hyde.
I made a Form Builder like Typeform where nobody can see your answers in clear text.
For context, I have a pet peeve about OAuth: I don't want to express my authorization by giving away my data. Once you authorize via OAuth, you have no way to be sure your data isn't put to unintended use or passed on to another party. "Revoking" in OAuth terms means nothing for data that has already been given away.
And I have a solution for this: not providing data!
There's a strong assumption that data has to be handed over to be processed. This has traditionally held, because computation could not happen anywhere other than "their server", where the programs are deployed.
Our idea is to reverse this. You don't transmit and expose your data to them. They disclose their computation to you, by submitting a Docker image to us.
I made a proof of concept for this. You can create a form just like in Google Forms or Typeform, but you won't be able to read the respondents' answers. To access the data, you need a Docker image. The image is executed on our end, with the responses mounted there as JSON files. It is expected to write its results to a specific file, which we then read and return via the API.
Any feedback would be warmly welcome.
</aside>
<aside> 📩 Contact me at @tnzk on Twitter or @tnzk:matrix.org.
</aside>
- POST /api/0.3/forms/[slug]/analysis submits a Docker image to be run. In Postman, the image is prepopulated with tnzk/hyde-form-example, but you can bring your own image. It returns a jobId for querying the status of the container.
- GET /api/0.3/forms/[slug]/analysis/[executionId] checks the status of the deployed container. You can poll it at the interval it returns in the waitInSec field (please, please respect this interval: this technical PoC handles load so naively that you can kill us easily). Once the container has completed, it returns the total word count as the result.

So far we've examined how to run the example Docker image to count the words in all the responses. You can build custom Docker images to implement more interesting analyses, such as word frequency, sentiment analysis, or the like.
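The submit-and-poll flow described above can be sketched in Python roughly as follows. This is a minimal sketch under assumptions, not an official client: BASE_URL is a placeholder host, and the request body shape ({"image": ...}), the "status" field name, and its terminal values are guesses based on the description here. Authentication, if any, is left out. The HTTP calls are passed in as functions so the polling logic can be tried without touching the real server.

```python
import json
import time
import urllib.request

BASE_URL = "https://hyde.example"  # hypothetical host; replace with the real API origin


def http_post_json(url, payload):
    """POST a JSON body and parse the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def http_get_json(url):
    """GET a URL and parse the JSON response."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def submit_analysis(slug, image, post_json=http_post_json):
    """Submit a Docker image for a form and return the jobId."""
    # Assumption: the image name is sent as {"image": ...}.
    body = post_json(f"{BASE_URL}/api/0.3/forms/{slug}/analysis", {"image": image})
    return body["jobId"]


def wait_for_result(slug, job_id, get_json=http_get_json, sleep=time.sleep,
                    terminal_statuses=("Succeeded", "Failure"),
                    default_wait=5, max_polls=100):
    """Poll the job, sleeping for the server-provided waitInSec between calls."""
    url = f"{BASE_URL}/api/0.3/forms/{slug}/analysis/{job_id}"
    for _ in range(max_polls):
        body = get_json(url)
        # Assumption: a "status" field reports the container state.
        if body.get("status") in terminal_statuses:
            return body
        # Respect the interval the server asks for; fall back to a default.
        sleep(body.get("waitInSec", default_wait))
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

With the defaults, `wait_for_result(slug, submit_analysis(slug, "tnzk/hyde-form-example"))` would submit the example image and block until the job reaches a terminal status, never polling faster than the server asks.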
You can write Docker images as usual. The example we've seen is implemented like this:
    import hyde


    def stringify(answer):
        """Render one answer as plain text, based on its declared type."""
        answer_type = answer['type']
        render = {
            'text': lambda: answer['value'],
            'number': lambda: str(answer['value']),
            'choices': lambda: ' '.join(answer['value']),
        }[answer_type]
        return render()


    try:
        word_count = 0
        # Only files named response-*.json hold form responses.
        response_files = [
            name for name in hyde.listdir_input()
            if name.startswith('response-') and name.endswith('.json')
        ]
        for path in response_files:
            response = hyde.read_json(path)
            for answer in response['answers']:
                text = stringify(answer)
                # split() with no argument skips empty strings and repeated spaces.
                word_count += len(text.split())
        result = {'message': f"Responses have {word_count} words in total"}
        hyde.write_json('result.json', result)
    except Exception as e:
        # Report application-level errors through error.json.
        hyde.write_json('error.json', {'error': repr(e)})
As you can see, this is a fairly ordinary Python script; the only unusual part is the hyde module. It provides several helpers for working in the survey runtime, though using it is optional.
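Since the hyde module is optional, here is a rough sketch of the same word count using only the standard library. It assumes the input directory /parcel/data/in and the output file /parcel/data/out/result.json described in this section, and it assumes error.json belongs in the same output directory. The helper names (count_words, write_json, main) are made up for this illustration and are not part of the hyde module.

```python
import json
import os

INPUT_DIR = "/parcel/data/in"    # where the responses are mounted
OUTPUT_DIR = "/parcel/data/out"  # where result.json is expected


def stringify(answer):
    """Render one answer as plain text, based on its declared type."""
    value = answer["value"]
    if answer["type"] == "choices":
        return " ".join(value)
    return str(value)


def count_words(input_dir):
    """Sum the word counts of every answer in every response-*.json file."""
    total = 0
    for name in sorted(os.listdir(input_dir)):
        if not (name.startswith("response-") and name.endswith(".json")):
            continue
        with open(os.path.join(input_dir, name), encoding="utf-8") as f:
            response = json.load(f)
        for answer in response["answers"]:
            total += len(stringify(answer).split())
    return total


def write_json(output_dir, filename, payload):
    """Write a JSON document into the output directory."""
    with open(os.path.join(output_dir, filename), "w", encoding="utf-8") as f:
        json.dump(payload, f)


def main(input_dir=INPUT_DIR, output_dir=OUTPUT_DIR):
    try:
        total = count_words(input_dir)
        write_json(output_dir, "result.json",
                   {"message": f"Responses have {total} words in total"})
    except Exception as e:
        # Assumption: error.json lives next to result.json.
        write_json(output_dir, "error.json", {"error": repr(e)})
```

Inside the container, calling `main()` as the last line of run.py would run the analysis against the mounted directories. Passing other directories to `main` makes it easy to try the script locally against a few sample response files.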
The points are:
- Input data is mounted at /parcel/data/in. Every response is provided as a JSON file. Here, hyde.listdir_input() gives you a list of the files in the input directory, and hyde.read_json(path) lets you read and parse a file as JSON conveniently. You can do the same thing with the os module.
- Write the result to /parcel/data/out/result.json on the success path, or to ./error.json on the failure path. Note that an exception is an error at the application level, so the status would still be Succeeded, whereas a status of Failure implies an error at the container level.
- CMD to be [], with run.py passed as the default, which you can overwrite if you like.

Example code: