Fundraising/Data and flow/Queues

From Wikitech

Fundraising tech uses the PHP-Queue library to abstract queue access. In production we use Redis lists as queue storage.

We have two Redis instances: one between the payments wiki and CiviCRM, and another between the donor / email preference center and CiviCRM.

Queues are used to decouple the payments frontend from the CiviCRM server. This is important for several reasons: it allows us to continue accepting donations even if the backend servers are down, it keeps our private database more secure, and it enforces write-only communication from the payments cluster.

The main data flow is over the donations queue. Completed payment transactions are encoded as JSON and sent over the wire, to be consumed by the donations queue consumer in wmf-civicrm and recorded in the CiviCRM database.
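The flow described above can be sketched as follows. This is only an illustration: `ListQueue` is an in-memory stand-in for a Redis list, not the PHP-Queue API, and the message fields (`gateway`, `order_id`, `gross`, `currency`) are hypothetical examples rather than the real donation message schema. It also assumes messages are appended at the tail and read from the head, which the text does not state.

```python
import json
from collections import deque


class ListQueue:
    """In-memory stand-in for a Redis list used as a FIFO queue (assumption)."""

    def __init__(self):
        self._items = deque()

    def push(self, message: dict) -> None:
        # Comparable to RPUSH: append the JSON-encoded message to the tail.
        self._items.append(json.dumps(message))

    def pop(self):
        # Comparable to LPOP: take the oldest message, or None when empty.
        if not self._items:
            return None
        return json.loads(self._items.popleft())


def consume_all(queue: ListQueue, store: list) -> int:
    """Drain the queue, recording each message in `store` (stand-in for the CRM)."""
    count = 0
    while (message := queue.pop()) is not None:
        store.append(message)
        count += 1
    return count


# Producer side: the frontend only ever writes to the queue.
donations = ListQueue()
donations.push({"gateway": "adyen", "order_id": "1234.1", "gross": "10.00", "currency": "USD"})
donations.push({"gateway": "paypal", "order_id": "1235.1", "gross": "5.00", "currency": "EUR"})

# Consumer side: runs later, possibly after a backend outage.
crm_rows = []
processed = consume_all(donations, crm_rows)
```

Because the producer never reads from the backend, donations pushed while the consumer is offline simply wait in the list until it runs again.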

Another important queue is the pending queue, which pipes messages to the pending table. This table holds information on each donation throughout the payment's progression. We currently keep all messages in this table and check the is_resolved flag to see whether they have reached a final status.

If control never returns from the processor, pending database messages remain for some time in case the data becomes useful again. After about 20 minutes, they become eligible for the pending transaction resolver. We attempt to complete settlement on these orders, and if successful, the completed message, including pending-provided details, is sent to the donations queue.
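As a rough sketch of the eligibility rule, assuming each pending row carries a timestamp and the is_resolved flag: the column names `date` and `order_id` are hypothetical, and the exact cutoff used by the real resolver may differ from the "about 20 minutes" modelled here.

```python
from datetime import datetime, timedelta, timezone

# "About 20 minutes" per the text; the real threshold is an assumption here.
RESOLVE_AFTER = timedelta(minutes=20)


def eligible_for_resolver(rows, now):
    """Return unresolved pending rows that are at least RESOLVE_AFTER old."""
    cutoff = now - RESOLVE_AFTER
    return [
        row for row in rows
        if not row["is_resolved"] and row["date"] <= cutoff
    ]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
pending_rows = [
    {"order_id": "A", "date": now - timedelta(minutes=45), "is_resolved": False},
    {"order_id": "B", "date": now - timedelta(minutes=5), "is_resolved": False},
    {"order_id": "C", "date": now - timedelta(minutes=60), "is_resolved": True},
]

# Only "A" is both old enough and still unresolved.
to_resolve = [row["order_id"] for row in eligible_for_resolver(pending_rows, now)]
```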

Various queue wrangling techniques are available.

Queues

Payments Flow
Queue | Producers | Consumers | Description
donations | payments, audit processors, IPN listeners, queue job runners | DonationsQueueConsumer (queue2civicrm) | Primary queue for incoming donations, written to whenever we learn about a successful payment.
recurring | audit processors, queue job runners | RecurringQueueConsumer (queue2civicrm) | Information about changes in donor monthly subscriptions. Also contains information about individual recurring payments, but we should send those to donations.
refund | PayPal queue job runner | RefundQueueConsumer (queue2civicrm) | Incoming IPN notifications that the processor has completed a refund.
pending | payments | PendingQueueConsumer (SmashPig) | Donation methods which use an iframe (GC, Adyen) will leave a message on the pending queue before transferring UI flow to the processor. SmashPig pending queue consumer dumps this to temporary storage used by some IPN listeners and frontends. Messages will either be read from the db table after a fixed time, to complete settlement steps; or upon incoming notification of a status change; or will expire in a short amount of time.
payments-init | payments | PaymentsInitQueueConsumer (queue2civicrm) | One entry for each transaction that leaves our flow control.
payments-antifraud | payments | AntifraudQueueConsumer (queue2civicrm) | Data on transaction fraud scores.
banner-history | payments | BannerHistoryQueueConsumer (queue2civicrm) | Used to capture correlations between contribution tracking ID and banner history logging ID.
contribution_tracking | payments, crm-queue2civi | crm-queue2civi, analytics | Hybrid write-only log and analytic store. This is a table in the drupal db, but we're working to turn this into a queue.
error logging (completion data) | payments | crm-audit | Syslog, but also mined to reconstruct missing transaction data. TODO: standardize line format, use normalized data and not gateway XML.
banner-impressions | wmf-varnish | legacy-bannerimpressions-job | TODO: Use Kafka client without filesystem layer.
Payments Listeners
Queue | Producers | Consumers | Description
jobs-adyen | Adyen IPN listener | Adyen queue job runner (SmashPig) | Job types sent: DownloadReportJob, ProcessCaptureRequestJob, RecordCaptureJob. When we receive IPN messages confirming authorizations and captures of credit card payments or availability of new audit reports we send jobs to this queue. Note that similar scaling problems to the initial paypal queue job design exist here and we may have to proliferate this queue to jobs-adyen-N.
jobs-paypal | PayPal IPN listener | PayPal queue job runner (SmashPig) | Job types sent: PayPal\Job. Originally this Job called the PayPal message verification endpoint, but we found that didn't scale and moved the verification back up to the initial IPN handler. Now this job just normalizes the IPN and drops it into one of the other queues.
jobs-gravy | | |
Donor / Email Preferences
Queue | Producers | Consumers | Description
unsubscribe | unsubscribe page | UnsubscribeQueueConsumer (queue2civicrm) | The FundraisingEmailUnsubscribe MediaWiki extension allows donors to opt out of bulk mailings and sends these requests over a queue, to be consumed by the CRM.
opt-in | opt-in page | OptInQueueConsumer (queue2civicrm) | The FundraisingEmailUnsubscribe MediaWiki extension also allows donors to opt in to bulk mailings and sends these requests over a queue, to be consumed by the CRM.
TODO: there are more queues now; add them here.

Components

Component | Operation | Queue | Description
DI-gateway-generic | push | payments-init | Donation collection frontend, which can create successful donation messages.
DI-gateway-generic | push | donations |
DI-gateway-generic | push | pending | When processing becomes asynchronous or risky, a copy of the transaction is pushed to the temporary "pending" queue, usually waiting to integrate some response from a payment provider.
DI-gateway-generic | get sequence | contribution-tracking (future) | This is a table right now, and the sequence comes from an autoincrement. It's the last link making our frontend depend on the backend db.
DI-gateway-generic | push | contribution_tracking (future) | This is a table right now, and we write directly to it.
crm-audit, py-audit | push | donations, recurring, refund | Nightly or weekly confirmation of payment events, in case our other methods of recording them fail.
crm-audit | search by id | error_logs (text files, not a queue) | This funky feature is to pull otherwise unconsumed information about failed transactions back into memory, to gather context for donations we otherwise missed.
Adyen IPN listener | push | jobs-adyen |
Adyen ProcessCaptureJob (SmashPig) | push | payments-antifraud |
Adyen RecordCaptureJob (SmashPig) | push | donations |
BannerHistoryQueueConsumer (queue2civicrm) | pop | banner-history | Import banner history log-donation id correlations.
DonationsQueueConsumer (queue2civicrm) | pop | donations | The main consumer is the queue2civi job, which reads from the donations queue and stores in our CRM database.
AntifraudQueueConsumer (queue2civicrm) | pop | payments-antifraud | Import statistics
PaymentsInitQueueConsumer (queue2civicrm) | pop | payments-init | Import statistics
PendingQueueConsumer (SmashPig) | pop | pending | Reads messages from pending queue and dumps them in pending table in smashpig database.
RequeueDelayedMessages (SmashPig) | push | * | SmashPig maintenance script to move messages from the damaged table back to their original queue if they have a retry_date in the past.
RecurringQueueConsumer (queue2civicrm) | pop | recurring | Q2C module. Writes information about new or changed subscriptions to civicrm_contribution_recur and records payment tokens in civicrm_payment_token. Also records individual donations in a subscription, but we should leave that to DonationsQueueConsumer.
RefundQueueConsumer (queue2civicrm) | pop | refund | Q2C module to record refund notifications, marking the affected contribution.
UnsubscribeQueueConsumer (queue2civicrm) | pop | unsubscribe | Decouples the mailing list unsubscribe UI from the CRM database.
DI-recurring-ingenico | push | payments-init | Make monthly charges. TODO: Should we push to the donations queue as well, and even the recurring events stream if we failban?
analytics | query by all indexes | contribution_tracking | TODO: should be decoupled from frontend event log
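The RequeueDelayedMessages behaviour listed in the table above (move damaged-table rows back to their original queue once retry_date has passed) could look roughly like the sketch below. It is not the SmashPig implementation: the row keys `original_queue` and `message` are hypothetical, and plain Python lists stand in for both the damaged table and the queues.

```python
from datetime import datetime, timedelta, timezone


def requeue_delayed(damaged_rows, queues, now):
    """Move rows whose retry_date has passed back to their original queue.

    Returns the rows that stay in the damaged table: those without a
    retry_date and those whose retry_date is still in the future.
    """
    remaining = []
    for row in damaged_rows:
        retry_date = row.get("retry_date")
        if retry_date is not None and retry_date <= now:
            queues.setdefault(row["original_queue"], []).append(row["message"])
        else:
            remaining.append(row)
    return remaining


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
damaged = [
    {"original_queue": "donations", "message": {"order_id": "1"}, "retry_date": now - timedelta(hours=1)},
    {"original_queue": "refund", "message": {"order_id": "2"}, "retry_date": now + timedelta(hours=1)},
    {"original_queue": "donations", "message": {"order_id": "3"}, "retry_date": None},
]
queues = {}
still_damaged = requeue_delayed(damaged, queues, now)
```

Rows with no retry_date are left in place in this sketch, on the assumption that they need manual attention rather than an automatic retry.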


Performance

Multiple queue-related graphs are available at https://frmon.wikimedia.org/d/Pq1YNMviz/fundraising-overview?orgId=1&from=now-24h&to=now&timezone=utc&refresh=1m

Donations

Handles:

  • New one-time donations from the front end
  • Initial recurring donations from the front end
  • New one-time donations from the audit
  • New recurring donations from audits that aren't PayPal or Fundraise Up

Recurring

Handles:

  • All PayPal EC and legacy PayPal recurring
  • New monthly converts
  • Something with UPI
  • Recurring upgrade
  • Autorescue

Pending

Payments Init

Contribution tracking

When a potential donor visits the Wikimedia donation page, a tracking record is created in the civicrm.civicrm_contribution_tracking table. This record includes the user's language, referrer, donation comment, opt-out status, a timestamp, and various other data. The tracking is handled on the MediaWiki side by the DonationInterface extension, which retrieves a contribution_tracking_id from a sequence generator in Redis and sends a message to the contribution-tracking Redis queue. In the queue2civicrm/contribution_tracking Drupal module, a drush job (ctqc) uses the ContributionTrackingQueueConsumer to write messages to the database table.
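The producer side of this flow can be sketched as below, under several assumptions: `itertools.count` stands in for the Redis sequence generator, a `deque` stands in for the contribution-tracking Redis list, and the message keys are illustrative rather than the real message format.

```python
import itertools
import json
from collections import deque

# Stand-ins (assumptions): a local counter instead of the Redis sequence
# generator, and a deque instead of the contribution-tracking Redis list.
_sequence = itertools.count(1)
contribution_tracking_queue = deque()


def track_visit(language: str, referrer: str, utm_source: str) -> int:
    """Allocate a tracking id and queue a tracking message for the consumer."""
    tracking_id = next(_sequence)
    message = {
        "contribution_tracking_id": tracking_id,
        "language": language,
        "referrer": referrer,
        "utm_source": utm_source,
    }
    contribution_tracking_queue.append(json.dumps(message))
    return tracking_id


first_id = track_visit("en", "https://example.org/", "B1.lp1.cc")
second_id = track_visit("fr", "https://example.org/", "B2.lp1.paypal")
```

Taking the id from a sequence rather than from a database autoincrement means the frontend never has to wait on, or connect to, the backend database.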

It also writes to a sister 'contribution_source' table, which is just an exploded version of the utm_source column. Since banners and Donatewiki concatenate 3 data points (banner, landing_page, and payment_method, separated by '.' characters) into the utm_source message field, the ContributionTrackingQueueConsumer splits them out and stores them separately in the contribution_source table for ease of querying. contribution_source has a foreign key to contribution_tracking_id.
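A minimal sketch of the split described above. The function name and the handling of short or over-long values are assumptions: missing trailing parts become empty strings, and any extra '.' characters are kept in the last field. The real consumer may treat those cases differently.

```python
def explode_utm_source(utm_source: str) -> dict:
    """Split 'banner.landing_page.payment_method' into separate fields."""
    # Split on at most two dots so any further dots stay in payment_method.
    parts = utm_source.split(".", 2)
    # Pad so that missing trailing components become empty strings.
    parts += [""] * (3 - len(parts))
    banner, landing_page, payment_method = parts
    return {
        "banner": banner,
        "landing_page": landing_page,
        "payment_method": payment_method,
    }


# Hypothetical utm_source value, for illustration only.
exploded = explode_utm_source("B1_example_banner.lp_default.cc")
```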

If the user makes a successful donation, a contribution record is passed to CiviCRM via the donations queue. The queue2civicrm module then inserts the contribution record into the CiviCRM database. To update the contribution_tracking record with the id that CiviCRM assigned to the contribution, it sends a message to the contribution-tracking queue containing just the contribution id and the contribution_tracking_id.
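One way to picture how the two kinds of message combine is an insert-or-update keyed on the tracking id, as in the sketch below. This is an assumption about the consumer's behaviour, not a description of ContributionTrackingQueueConsumer itself; the dict stands in for the database table and the key names are hypothetical.

```python
def apply_tracking_message(table: dict, message: dict) -> None:
    """Insert a row for a new tracking id, or merge fields into an existing row."""
    tracking_id = message["contribution_tracking_id"]
    row = table.setdefault(tracking_id, {})
    row.update(message)


table = {}

# First message: created when the donor visits the donation page.
apply_tracking_message(table, {
    "contribution_tracking_id": 42,
    "language": "en",
    "utm_source": "B1.lp1.cc",
})

# Second message: sent after the donation is recorded, carrying only the two ids.
apply_tracking_message(table, {
    "contribution_tracking_id": 42,
    "contribution_id": 9001,
})
```

After both messages the row for tracking id 42 holds the original visit data plus the contribution id.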

More info on testing here