When a client orders a Drupal website from Omitsis, it’s quite common that it’s an existing site, which means they have content they want to migrate.
Drupal has the migrate module, which is part of the core and lets us migrate content. It’s a powerful, versatile module that supports rollbacks out of the box. Rollbacks let us “un-import”, which is super useful when we’re still in development and there are errors.
The migrate module lets us import from different sources: a Drupal 7 site, a database with any structure, CSV, JSON, XML, SOAP, etc.
It’s very common for clients to send us an Excel file with all the fields to migrate for each content type. In this post we’re going to explain how to do this kind of migration: from a CSV, since every spreadsheet (Excel for example) lets you save as CSV and it’s a standard, well-supported format.
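To make the rest of the post easier to follow, here is a minimal sketch of what such a CSV could look like once exported from the spreadsheet. The column names (cat, code, title, image) are the ones used in the migration further down; the rows themselves, the file names and the data/ folder are invented for illustration only.

```shell
# Hypothetical sample data: every row below is made up for this example.
# The header row matches the columns the migration reads: cat, code, title, image.
mkdir -p data
cat > data/products.csv <<'EOF'
cat,code,title,image
Chairs,ABC123,"OAK CHAIR, NATURAL FINISH",oak-chair.jpg
Tables,DEF456,ROUND DINING TABLE,round-table.jpg
Lamps,GHI789,BRASS FLOOR LAMP,floor-lamp.jpg
EOF

# Quick sanity check: print the header row.
head -n 1 data/products.csv
```

Note the first product: its title contains a comma, so the field is wrapped in double quotes, which is the enclosure character configured in the source section later on.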
Everything we’re going to explain here is valid for modern Drupal. When I say “modern Drupal” I mean Drupal 8 and later (9, 10, 11…), because starting from version 8 things changed quite a bit with the adoption of Symfony, and upgrades between major versions became much more manageable. These days, if you’re starting a new project, you should be on Drupal 10 or Drupal 11. Drupal 7, 8 and 9 are no longer supported, so if you still have something running on those… well, you’ve got another item for your to-do list.
The first thing to do is install the modules we’re going to need. Migrate is already in Drupal core, but we’ll need a few more: migrate_plus, migrate_tools and migrate_source_csv.
We install these modules with composer:
composer require drupal/migrate_plus drupal/migrate_tools drupal/migrate_source_csv
And we enable them, with drush it would be like this:
drush -y en migrate migrate_plus migrate_tools migrate_source_csv
Now we need to create the YAML files that will define the migrations. We have to do this inside a custom module, which we might already have, or we can create one specifically for the migration (which is the cleaner option).
You can easily generate a module with drush. In our case we’ll call it custom_migrate.
drush generate module
Inside this module’s directory we create a subfolder called config and inside that one another called install. That’s where we’ll put the YAML files that define the migrations.
The naming convention for these files has to be:
migrate_plus.migration.[MIGRATION_ID].yml
For example, if we want to import products, the file name could be:
migrate_plus.migration.products.yml
In this example the file would be at the following path:
web/modules/custom/custom_migrate/config/install/migrate_plus.migration.products.yml
The migrate_plus module also lets us create groups, so we can run imports (and rollbacks) in batches. You can check out this post to see how. We won’t be using them in this example, but they’re pretty handy.
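If you would rather create the config/install folder and the empty migration file from the command line, something like the following should do it. This is only a sketch: it assumes you run it from the project root, that the docroot is web/, and that the module lives in web/modules/custom/custom_migrate. The MODULE_DIR variable and the data/ folder (a place to keep the CSV inside the module) are choices made for this example.

```shell
# Assumption: run from the project root, with the Drupal docroot in web/.
# MODULE_DIR is a name chosen for this sketch; override it if your layout differs.
MODULE_DIR="${MODULE_DIR:-web/modules/custom/custom_migrate}"

# config/install holds the migration YAML files; data/ will hold the CSV.
mkdir -p "$MODULE_DIR/config/install" "$MODULE_DIR/data"

# Empty placeholder following the migrate_plus.migration.[MIGRATION_ID].yml convention.
touch "$MODULE_DIR/config/install/migrate_plus.migration.products.yml"

# List what was created.
find "$MODULE_DIR" -type f
```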
Now we need one more module: config_devel. This one allows Drupal to re-read configuration files whenever we ask it to, without having to install and uninstall the module every time we change something. That said, config_devel is for development environments only; don’t leave it enabled in production.
To install it:
composer require --dev drupal/config_devel
And to enable it:
drush -y en config_devel
Now we have to tell Drupal to take into account the import file we just defined. That goes in our module’s .info.yml file. The whole file would look like this:
name: 'Custom migrate'
type: module
description: 'Content import module'
core_version_requirement: ^10 || ^11
package: 'Custom'
dependencies:
  - migrate
  - migrate_plus
  - migrate_tools
  - config_devel
config_devel:
  install:
    - migrate_plus.migration.products
    # - migrate_plus.migration.another_import
Now the most important part: the content of the import YAML. This could fill many posts, so we’re going to give a simple example:
First we need to define the id. A label is also worth adding, to give the migration a readable name.
id: products
label: Import products
Next comes the source section. Here we have to tell it which plugin we’re using and, for the CSV case, the path to the CSV file. In this case we’ve put it inside the module so it’s versioned in git along with the rest of the code.
We’ve also indicated the column that holds the unique id, which in our CSV is called “code”.
And finally, we list the CSV columns.
source:
  plugin: 'csv'
  # Full path to the file.
  path: 'modules/custom/custom_migrate/data/products.csv'
  # Column delimiter. Comma (,) by default.
  delimiter: ','
  # Field enclosure. Double quotation marks (") by default.
  enclosure: '"'
  # The row to be used as the CSV header (indexed from 0),
  # or null if there is no header row.
  header_offset: 0
  # The column(s) to use as a key. Each column specified will
  # create an index in the migration table and too many columns
  # may throw an index size error.
  ids:
    - code
  # Here we identify the columns of interest in the source file.
  # Each numeric key is the 0-based index of the column.
  # For each column, the key below is the field name assigned to
  # the data on import, to be used in field mappings below.
  # The label value is a user-friendly string for display by the
  # migration UI.
  fields:
    0:
      name: cat
      label: 'Category'
    1:
      name: code
      label: 'Product code'
    2:
      name: title
      label: 'Title'
    3:
      name: image
      label: 'Image'
Now the process part, where we tell it where each piece of data should go, which plugin to use, and which source column it comes from.
If it’s something very simple, it could be as easy as this:
field_name_in_our_drupal: source_column_name
But often it’s not that simple. For example, the title comes in all caps, so we chain two callback plugins: one to lowercase it and one to capitalize the first letter. You’ll also see the part for importing into a media field. We’ll explain that one in another post soon.
A real-world case could be this:
process:
  type:
    plugin: default_value
    default_value: producto
  field_cat:
    plugin: entity_lookup
    source: cat
    value_key: name
    bundle_key: vid
    bundle: cat_producto
    entity_type: taxonomy_term
    ignore_case: true
  field_producto_codigo: code
  title:
    -
      source: title
      plugin: callback
      callable: mb_strtolower
    -
      plugin: callback
      callable: ucfirst
  field_imagen/target_id:
    plugin: entity_lookup
    value_key: name
    source: image
    bundle_key: bundle
    bundle: image
    entity_type: media
    ignore_case: 1
    access_check: 0
There are lots of plugins available, you can see them here. We can also create plugins quite easily.
The only thing left is to tell it where all this content is going to go using destination. It depends on the entity we’re importing, in this case nodes.
destination:
  plugin: entity:node
And here’s the whole thing:
id: products
label: Import products
source:
  plugin: 'csv'
  # Full path to the file.
  path: 'modules/custom/custom_migrate/data/products.csv'
  # Column delimiter. Comma (,) by default.
  delimiter: ','
  # Field enclosure. Double quotation marks (") by default.
  enclosure: '"'
  # The row to be used as the CSV header (indexed from 0),
  # or null if there is no header row.
  header_offset: 0
  # The column(s) to use as a key. Each column specified will
  # create an index in the migration table and too many columns
  # may throw an index size error.
  ids:
    - code
  # Here we identify the columns of interest in the source file.
  # Each numeric key is the 0-based index of the column.
  # For each column, the key below is the field name assigned to
  # the data on import, to be used in field mappings below.
  # The label value is a user-friendly string for display by the
  # migration UI.
  fields:
    0:
      name: cat
      label: 'Category'
    1:
      name: code
      label: 'Product code'
    2:
      name: title
      label: 'Title'
    3:
      name: image
      label: 'Image'
process:
  field_cat:
    plugin: entity_lookup
    source: cat
    value_key: name
    bundle_key: vid
    bundle: cat_producto
    entity_type: taxonomy_term
    ignore_case: true
  field_producto_codigo: code
  title:
    -
      source: title
      plugin: callback
      callable: mb_strtolower
    -
      plugin: callback
      callable: ucfirst
  type:
    plugin: default_value
    default_value: producto
  field_imagen/target_id:
    plugin: entity_lookup
    value_key: name
    source: image
    bundle_key: bundle
    bundle: image
    entity_type: media
    ignore_case: 1
    access_check: 0
destination:
  plugin: entity:node
migration_dependencies: {}
Now we just need to run the drush commands to perform the import. You can also do it from the UI at admin/structure/migrate.
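Since the migration uses the code column as its unique id, it can be worth checking that the CSV really has no repeated codes before importing anything. The helper below is a rough sketch, not part of any Drupal module: duplicate_codes is a name made up for this post, and it assumes that code is the second column and that no field before it contains an embedded comma or line break, so a plain comma split is good enough.

```shell
# Hypothetical helper: print every value of the "code" column that appears more than once.
# Assumption: "code" is column 2 and the first column never contains a quoted comma
# or a newline, so cut can split on plain commas.
duplicate_codes() {
  # Skip the header row, keep column 2, then list the values that repeat.
  tail -n +2 "$1" | cut -d, -f2 | sort | uniq -d
}
```

You would call it with the path to the file, for example duplicate_codes web/modules/custom/custom_migrate/data/products.csv. If it prints nothing, every code is unique.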
Quick note before we keep going: I’m going to use the short aliases because those are the ones I use day to day and have built into my muscle memory. If you prefer the long form, here’s the equivalence:
| Short alias | Long form |
|---|---|
| drush mim | drush migrate:import |
| drush mr | drush migrate:rollback |
| drush ms | drush migrate:status |
| drush mrs | drush migrate:reset-status |
| drush cdi | drush config:devel-import |
| drush en | drush pm:install |
The first thing is to tell Drupal to load all the import files using the config_devel module:
drush cdi [module_name]
Which in our case would be:
drush cdi custom_migrate
Remember that the YAML has to be in config/install AND listed in the config_devel: section of the .info.yml. If either of those is missing, drush cdi won’t load it.
And write this down somewhere because you’re going to trip over it: every time you change the migration YAML, you have to run drush cdi again. Otherwise the changes won’t apply and you’ll go crazy trying to figure out why the migration is still doing the same thing as before. I’m telling you from experience.
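One way to stop forgetting that step is to wrap the reload and the import in a single shell function. This is just a sketch of the idea: reimport and the DRUSH variable are names invented here, and the function only chains the two commands from this post (drush cdi, then drush mim), passing any extra flags through to the import.

```shell
# DRUSH can point at another binary (for example vendor/bin/drush);
# by default it is whatever "drush" is on the PATH.
DRUSH="${DRUSH:-drush}"

# Hypothetical helper: reimport MODULE MIGRATION_ID [extra flags for drush mim...]
reimport() {
  if [ "$#" -lt 2 ]; then
    echo "usage: reimport MODULE MIGRATION_ID [flags...]" >&2
    return 1
  fi
  module="$1"
  migration="$2"
  shift 2
  # Reload the migration YAML first; only import if the reload succeeded.
  "$DRUSH" cdi "$module" && "$DRUSH" mim "$migration" "$@"
}
```

In our case that would be reimport custom_migrate products, so an edited YAML is always reloaded before the import runs.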
Then we can see the migrations we have and their status with:
drush ms
To run the import:
drush mim [import_id]
Which in our case would be:
drush mim products
If something goes wrong, which is the norm, we can do a rollback with this command:
drush mr products
It’s also possible that the migration gets stuck, in which case before doing the rollback we have to set it to idle:
drush mrs products
A few flags worth knowing
Once you’ve got a few migrations under your belt, you’ll find yourself using the same four flags over and over. I’m putting them here so you don’t have to discover them the hard way like I did.
--limit
To test the migration with a few rows before running it on the whole thing. If your CSV has 50,000 rows and you suspect something’s going to fail, don’t dive in head first:
drush mim products --limit=10
--update
Re-imports records that were already imported. Useful when you change something in the process and want it to apply to what’s already in the database too, without having to do a full rollback:
drush mim products --update
--idlist
To re-import (or import for the first time) a specific id or several. Perfect when you’re debugging a weird case and want to focus on a specific row:
drush mim products --idlist=ABC123
drush mim products --idlist=ABC123,DEF456
--sync
Syncs the destination with the source: it imports new records, updates existing ones, and deletes from the destination anything that’s no longer in the source. Be careful with this one, because if you accidentally delete something from the CSV you’ll wipe out content in Drupal. But if that’s exactly what you want, it saves you a lot of hassle:
drush mim products --sync
And that’s pretty much it. This isn’t meant to be an exhaustive guide to the migrate module, which would fill many posts (importing images to media, paragraphs, taxonomies with hierarchy, custom plugins…), but as an introduction to running a migration from a CSV, I think it covers the basics.
If it helped you out and saved you a couple of hours of banging your head against the keyboard, great. And if you share it with someone who’s just starting out with migrations, even better.