
Difference between revisions of "Etherpad.wikimedia.org"

From Wikitech-static
imported>ArielGlenn
(→‎Deleting pads via site admin: added a more targeted deletion. the double wildcard was very slow)
imported>Alexandros Kosiaris
(4 intermediate revisions by 3 users not shown)
Line 1: Line 1:
'''etherpad.wikimedia.org'''.
+
'''https://etherpad.wikimedia.org'''
  
: '''Note''' etherpads are 100% public and open.  Anyone can read them.  "Obscure names" are never as obscure as you think and are NOT security.  Also the etherpad database is not suitable for any long-term storage — don't expect important data to stay there.
+
: '''Note:''' etherpads are 100% public and open.  Anyone can read them.  "Obscure names" are never as obscure as you think and are NOT security.  Also the etherpad database is not suitable for any long-term storage — don't expect important data to stay there.
=== Hardware ===
 
* on a etherpad1001, a VM on ganeti01.svc.eqiad.wmnet cluster
 
  
=== About ===
+
== Hardware ==
We built our own package dependant on our own nodejs packages. Everything is puppetized
+
Running on etherpad1002, a VM on ganeti01.svc.eqiad.wmnet cluster
 +
 
 +
== About ==
 +
We built our own package dependent on our own nodejs packages. Everything is puppetized.
  
 
The database that it uses is on ... just look this up in the puppet site manifest. Cluster m1 as of this writing.
 
The database that it uses is on ... just look this up in the puppet site manifest. Cluster m1 as of this writing.
  
The app runs on port 9000 and requests are reverse proxied by apache which also terminates SSL
+
The app runs on port 9000 and requests are reverse proxied by envoy which also terminates SSL.
  
The [[mw:Extension:EtherpadLite|EtherpadLite extension]] allows embedding it into wiki pages.
+
The [[mw:Extension:EtherpadLite|EtherpadLite extension]] (not currently used) allows embedding it into wiki pages.
  
=== Database layout ===
+
== Database layout ==
  
 
Etherpad-lite has decided to implement a key/value store on top of a RDBMS for some reason. Well it is an abstraction layer so they can work with other backends as well but it seems like the recommended option is an RDDBS (MySQL).  
 
Etherpad-lite has decided to implement a key/value store on top of a RDBMS for some reason. Well it is an abstraction layer so they can work with other backends as well but it seems like the recommended option is an RDDBS (MySQL).  
 +
 
http://etherpad.org/doc/v1.2.1/#index_database_structure seems to be the official documentation (version dependent obviously)
 
http://etherpad.org/doc/v1.2.1/#index_database_structure seems to be the official documentation (version dependent obviously)
  
=== Deleting pads via site admin ===
+
== Deleting pads via site admin ==
 +
:''To request a deletion, [https://phabricator.wikimedia.org/maniphest/task/edit/form/2/ file a security task on Phabricator].''
 +
A variety of ways exist (some are not available/do not work):
  
A variety of ways supposedly exist:
+
#Deletion through admin and a plugin(we do not have admin and users on purpose for now) so this is ruled out
 +
#Deletion through the API https://github.com/ether/etherpad-lite/wiki/HTTP-API (suggested method):
  
#Deletion through admin and a plugin(we do not have admin and users on purpose for now) so this is ruled out
+
:# Login to the etherpad host, at the moment, <code>etherpad1001.eqiad.wmnet</code>
#Deletion through the API https://github.com/ether/etherpad-lite/wiki/HTTP-API (we do not have users and admins on purpose for now so this is ruled out)
+
:# Search the API key created on etherpad first start, found on <code>/var/lib/etherpad-lite/APIKEY.txt</code>
 +
:# Call the deletion api:
 +
curl 'localhost:9001/api/1/deletePad?apikey=<api key gotten from the previous step>&padID=<pad name as used on the URI>'
 +
:# If everthings is ok, it should respond with <code>{"code":0,"message":"ok","data":null}</code>
 
# Deletion through the CLI https://github.com/ether/etherpad-lite/wiki/Getting-to-know-the-tools-in-bin. Supposedly this should work but it doesn't
 
# Deletion through the CLI https://github.com/ether/etherpad-lite/wiki/Getting-to-know-the-tools-in-bin. Supposedly this should work but it doesn't
#Deletion through the DB (this seems to be the only viable option at the time of this writing)
+
#Deletion through the DB (this seems to be the only alternative viable option to the API)
  
 
Suppose DELETEME is the pad id of the pad you want to remove (pad id can be taken from the url)
 
Suppose DELETEME is the pad id of the pad you want to remove (pad id can be taken from the url)
Line 41: Line 49:
 
</pre>
 
</pre>
  
=== how to list all pads ===
+
== How to list all pads ==
  
 
Two different plugins existed at the time of investigation, one was not installing correctly, one was not of any decent quality
 
Two different plugins existed at the time of investigation, one was not installing correctly, one was not of any decent quality
  
=== Mediawiki extension ===
+
== Mediawiki extension ==
  
 
Yes, don't we want to use that and embed in a wiki?
 
Yes, don't we want to use that and embed in a wiki?
Line 51: Line 59:
 
[https://www.mediawiki.org/wiki/Extension:EtherpadLite Extension:EtherpadLite]
 
[https://www.mediawiki.org/wiki/Extension:EtherpadLite Extension:EtherpadLite]
  
 +
== Converting etherpad content into wikitext ==
 +
 +
*[https://github.com/tbayer/etherpad2wiki Small Python script] to convert Etherpads into wiki pages - please help turn this into a Toolforge tool!
 +
 +
== Maintenance work ==
 +
 +
Building new debs whenever there are new releases/security patches is the main one here. However since this uses MariaDB misc, also have a look at [[MariaDB/misc]]
 
== See also ==
 
== See also ==
 
* http://etherpad.wikimedia.org
 
* http://etherpad.wikimedia.org
 
* [[m:Etherpad]]
 
* [[m:Etherpad]]
 
* [[mw:Etherpad Lite]]
 
* [[mw:Etherpad Lite]]
 +
* <cite class="citation journal">D'Angelo, Gabriele; Iorio, Angelo Di; Zacchiroli, Stefano (2018-11-03). [https://hal.inria.fr/hal-01882069 "Spacetime Characterization of Real-Time Collaborative Editing"]. ''Proceedings of the ACM on Human-Computer Interaction''. <b>2</b>.</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.jtitle=Proceedings+of+the+ACM+on+Human-Computer+Interaction&rft.atitle=Spacetime+Characterization+of+Real-Time+Collaborative+Editing&rft.volume=2&rft.date=2018-11-03&rft.aulast=D%27Angelo&rft.aufirst=Gabriele&rft.au=Iorio%2C+Angelo+Di&rft.au=Zacchiroli%2C+Stefano&rft_id=https%3A%2F%2Fhal.inria.fr%2Fhal-01882069&rfr_id=info%3Asid%2Fen.wikipedia.org%3ASpecial%3AExpandTemplates" class="Z3988"></span> ("We [...] studied the full editing histories of about 14 000 textual documents (or pads, in EtherPad terminology) from http://etherpad.wikimedia.org/, which is one of the most popular public instances of Etherpad, hosted by the Wikimedia Foundation.")
  
 
{{lowercase}}
 
{{lowercase}}
 
[[Category:Services]]
 
[[Category:Services]]

Revision as of 08:11, 8 July 2020

https://etherpad.wikimedia.org

Note: etherpads are 100% public and open. Anyone can read them. "Obscure names" are never as obscure as you think and are NOT security. Also the etherpad database is not suitable for any long-term storage — don't expect important data to stay there.

Hardware

Running on etherpad1002, a VM on ganeti01.svc.eqiad.wmnet cluster

About

We built our own package dependent on our own nodejs packages. Everything is puppetized.

The database that it uses is on ... just look this up in the puppet site manifest. Cluster m1 as of this writing.

The app runs on port 9000 and requests are reverse proxied by envoy which also terminates SSL.

The EtherpadLite extension (not currently used) allows embedding it into wiki pages.

Database layout

Etherpad-lite has decided to implement a key/value store on top of an RDBMS for some reason. It is an abstraction layer, so it can work with other backends as well, but the recommended option seems to be an RDBMS (MySQL).

http://etherpad.org/doc/v1.2.1/#index_database_structure seems to be the official documentation (version dependent obviously)
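
To make the key/value-on-RDBMS idea concrete, here is a minimal sketch. It is not Etherpad's actual code: the class `KeyValueStore` is hypothetical, SQLite stands in for MySQL/MariaDB, and the `value` column name and JSON encoding are assumptions. The table name `store` and the `key` column are taken from the deletion queries on this page.

```python
import json
import sqlite3


class KeyValueStore:
    """Hypothetical key/value layer on top of a single two-column SQL table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        # One table holds everything; `value` (assumed name) stores JSON text.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS store (`key` TEXT PRIMARY KEY, `value` TEXT)"
        )

    def set(self, key: str, value) -> None:
        # REPLACE overwrites any existing row that has the same primary key.
        self.conn.execute(
            "REPLACE INTO store (`key`, `value`) VALUES (?, ?)",
            (key, json.dumps(value)),
        )

    def get(self, key: str):
        row = self.conn.execute(
            "SELECT `value` FROM store WHERE `key` = ?", (key,)
        ).fetchone()
        return None if row is None else json.loads(row[0])


if __name__ == "__main__":
    kv = KeyValueStore(sqlite3.connect(":memory:"))
    kv.set("pad:demo", {"text": "hello"})
    print(kv.get("pad:demo"))
```

Because every record lives in one table addressed by a string key, operations on "a pad" become operations on a set of keys, which is why the deletion queries below work on key patterns.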

Deleting pads via site admin

To request a deletion, file a security task on Phabricator.

A variety of ways exist (some are not available/do not work):

  1. Deletion through admin and a plugin. We do not have admin and users, on purpose for now, so this is ruled out.
  2. Deletion through the API https://github.com/ether/etherpad-lite/wiki/HTTP-API (suggested method):
       1. Log in to the etherpad host, at the moment etherpad1001.eqiad.wmnet
       2. Look up the API key created on etherpad's first start, found in /var/lib/etherpad-lite/APIKEY.txt
       3. Call the deletion API:
          curl 'localhost:9001/api/1/deletePad?apikey=<api key gotten from the previous step>&padID=<pad name as used on the URI>'
       4. If everything is OK, it should respond with {"code":0,"message":"ok","data":null}
  3. Deletion through the CLI https://github.com/ether/etherpad-lite/wiki/Getting-to-know-the-tools-in-bin. Supposedly this should work, but it doesn't.
  4. Deletion through the DB (this seems to be the only viable alternative to the API)
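
The API call from the suggested method above could be scripted roughly as follows. This is a minimal sketch, not tooling that exists on the host: the function names `build_delete_url` and `deletion_succeeded` are hypothetical, and the base URL and path are copied from the curl command. It percent-encodes the pad name, which matters for pads whose names contain spaces or slashes.

```python
import json
from urllib.parse import urlencode


def build_delete_url(api_key: str, pad_id: str,
                     base: str = "http://localhost:9001") -> str:
    """Build the deletePad URL, percent-encoding the key and the pad ID."""
    query = urlencode({"apikey": api_key, "padID": pad_id})
    return f"{base}/api/1/deletePad?{query}"


def deletion_succeeded(body: str) -> bool:
    """Return True if the response body reports code 0, as in the 'ok' reply."""
    reply = json.loads(body)
    return reply.get("code") == 0


if __name__ == "__main__":
    # EXAMPLE_KEY is a placeholder; the real key is in APIKEY.txt on the host.
    print(build_delete_url("EXAMPLE_KEY", "My pad/2020"))
    print(deletion_succeeded('{"code":0,"message":"ok","data":null}'))
```

The resulting URL can be passed to curl on the etherpad host, and the response body checked with `deletion_succeeded`.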

For deletion through the DB, suppose DELETEME is the pad ID of the pad you want to remove (the pad ID can be taken from the URL):

 delete from store where `key` like '%DELETEME%'; 

Note that I had good luck deleting pad content via the statements below, which toss revisions and chats (I don't know exactly what the pad2readonly bit is). This is a lot faster than the %DELETEME% query above, now that the DB is so bloated.

 delete from store where `key` like 'pad:DELETEME%';
 delete from store where `key` like 'pad2readonly:DELETEME%';
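
One caveat with prefix patterns: `like 'pad:DELETEME%'` also matches keys of any other pad whose name merely starts with DELETEME (for example a pad called DELETEME2), and `_` or `%` inside a pad name act as LIKE wildcards. The sketch below shows a stricter variant. It is an illustration only: SQLite stands in for MariaDB, the helper names are hypothetical, and the key layout (`pad:<id>`, `pad:<id>:revs:<n>`, `pad:<id>:chat:<n>`, `pad2readonly:<id>`) is an assumption based on the description above.

```python
import sqlite3


def like_prefix(prefix: str) -> str:
    """Escape LIKE wildcards in prefix, then append a trailing %."""
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped + "%"


def delete_pad_keys(conn: sqlite3.Connection, pad_id: str) -> int:
    """Delete one pad's keys without touching pads that share its name prefix."""
    cur = conn.cursor()
    deleted = 0
    # Exact matches for the pad record and its read-only mapping (assumed keys).
    for key in (f"pad:{pad_id}", f"pad2readonly:{pad_id}"):
        cur.execute("DELETE FROM store WHERE `key` = ?", (key,))
        deleted += cur.rowcount
    # Sub-keys such as revisions and chats: the prefix ends with ':' so that
    # a pad named DELETEME2 is not caught by a delete of DELETEME.
    cur.execute(
        "DELETE FROM store WHERE `key` LIKE ? ESCAPE '\\'",
        (like_prefix(f"pad:{pad_id}:"),),
    )
    deleted += cur.rowcount
    conn.commit()
    return deleted


def demo_store() -> sqlite3.Connection:
    """Build an in-memory table with made-up keys for two similarly named pads."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE store (`key` TEXT PRIMARY KEY, `value` TEXT)")
    keys = [
        "pad:DELETEME", "pad:DELETEME:revs:0", "pad:DELETEME:chat:0",
        "pad2readonly:DELETEME",
        "pad:DELETEME2", "pad:DELETEME2:revs:0", "pad:other",
    ]
    conn.executemany("INSERT INTO store VALUES (?, '{}')", [(k,) for k in keys])
    return conn


if __name__ == "__main__":
    conn = demo_store()
    print(delete_pad_keys(conn, "DELETEME"))
    print(sorted(row[0] for row in conn.execute("SELECT `key` FROM store")))
```

Whichever form is used, running the equivalent `select` first to see which keys match is a cheap sanity check before deleting.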

How to list all pads

Two different plugins existed at the time of investigation: one was not installing correctly, and the other was not of any decent quality.

Mediawiki extension

Yes, don't we want to use that and embed in a wiki?

Extension:EtherpadLite

Converting etherpad content into wikitext

  • Small Python script to convert Etherpads into wiki pages - please help turn this into a Toolforge tool!

Maintenance work

Building new debs whenever there are new releases/security patches is the main task here. However, since this uses MariaDB misc, also have a look at MariaDB/misc.

See also