Handling HTML Markup with Drupal's Migrate API


  • Benji Fisher

In Drupal 8, we use the core Migrate API for

  • Upgrading Drupal 6 and Drupal 7 sites
  • Migrating sites from other systems to Drupal
  • Recurring imports from external systems (feeds)

It is a robust, flexible tool.

Drupal works best with structured data, and the Migrate API supports this: file attachments, related taxonomy terms, references to authors or other nodes, and so on. Along with the structured data, we also have to deal with blocks of text, and these blocks often contain HTML markup.

Until now, the Migrate API has supported basic processing of text fields using regular expressions. Marco Villegas and I contributed some plugins to the Migrate Plus module to support proper HTML parsing. This is easier to use and more reliable than using regular expressions.
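As a rough sketch of how the DOM-based approach fits into a migration, the `process` section below parses a body field into a DOM document, rewrites link targets, and serializes the result back to a string. It assumes the `dom` and `dom_str_replace` plugins as documented in Migrate Plus. The field names and the search and replace strings are placeholders for illustration, not values from a real migration.

```yaml
process:
  'body/value':
    # Parse the source HTML string into a DOM document.
    - plugin: dom
      method: import
      source: 'body/0/value'
    # Rewrite the href attribute of every <a> element selected by the XPath.
    # The search and replace values are hypothetical.
    - plugin: dom_str_replace
      mode: attribute
      xpath: '//a'
      attribute_options:
        name: href
      search: 'http://old.example.com'
      replace: 'https://new.example.com'
    # Serialize the DOM document back to an HTML string.
    - plugin: dom
      method: export
```

Because the replacement is scoped by an XPath expression and an attribute name, it only touches link targets, whereas a regular expression applied to the raw string could also match the same text elsewhere in the markup.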

We originally wrote these plugins while working for Isovera on a project for Pega Systems. Both Isovera and Pega have supported sharing these plugins with the Drupal community. I hope other developers will use them and give back some of their own plugins that use the same approach.

In this session you will

  • Get an overview of the Migrate API in Drupal 8
  • Get an introduction to the new DOM-based plugins in Migrate Plus
  • Learn how to use the new plugins in your own migrations. (Demo time!)
  • See how to extend the framework with your own custom plugins

Slides on my GitLab Pages: https://benjifisher.gitlab.io/slide-decks/html-migrate-dcnj-2020.html

Attachment Size
Slides (PDF) 833.19 KB

Who Should Attend

  • Back-end Developers


Prerequisites

  • Familiarity with HTML markup
  • Interest in the Migrate API