Migrate in practice: migrating from uchome to Drupal 8

逆流の鱼, 17 August 2020

A Drupal migration has three stages: source, process, and destination. Data can be transformed at each of them.

Source

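Drupal core's Migrate module provides two basic kinds of source plugin: embedded data (the `embedded_data` plugin) and SQL sources (built on the `SqlBase` class). Embedded data is typed directly into the code or configuration and is mainly useful for early testing; a SQL source reads its data from a SQL database.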
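For a quick test before any SQL source exists, an embedded data source can be sketched roughly as follows. The field names (`picid`, `title`) are borrowed from the photo example later in this post, and the row values are made up for illustration.

```yaml
# Sketch only: a source section using core's embedded_data plugin.
# The rows below are invented test data, not real uchome records.
source:
  plugin: embedded_data
  data_rows:
    - picid: 1
      title: 'First test photo'
    - picid: 2
      title: 'Second test photo'
  # Tells the migration which source field uniquely identifies a row.
  ids:
    picid:
      type: integer
```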

Process

The process stage mainly defines how source fields map to destination fields.
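As a rough illustration of what such a mapping looks like, the sketch below shows a direct copy, the same copy written out with the `get` plugin, and a constant supplied through `default_value`. The `status` field is an assumption added for the example; the other names come from the photo migration shown later.

```yaml
# Sketch only: three common ways to fill a destination field.
process:
  # Short form: destination field "title" is copied from source field "title".
  title: title
  # Long form of the same kind of copy, naming the get plugin explicitly.
  created:
    plugin: get
    source: dateline
  # No source field at all: every row gets the same constant value.
  status:
    plugin: default_value
    default_value: 1
```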

Destination

This stage specifies where the migrated data goes: user, node, taxonomy term, or some other entity.
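A destination section is usually very short. The sketch below assumes the data should become nodes of a `photo` content type; the bundle name is an assumption for illustration, and the commented lines show how other entity types would be targeted.

```yaml
# Sketch only: write each migrated row as a node of type "photo".
destination:
  plugin: 'entity:node'
  default_bundle: photo
  # Other entity destinations follow the same pattern, for example:
  #   plugin: 'entity:user'
  #   plugin: 'entity:taxonomy_term'
```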

uchome

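uchome 2.0, released before 2009, is free social network software developed by Comsenz (康盛). It was popular for many years; development stopped around 2015, but some sites still run it today. Its data lives mainly in MySQL tables, which makes writing a migrate source very convenient: just extract the data table by table.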

Although the software is old, documentation of the uchome data structures can fortunately still be found online.

The main uchome data types are user, space, blog, mtag (topic), album, photo, post, and various kinds of comment. Only photo is used as an example below.

Preparation

  1. Install migrate, migrate tools, and migrate plus on the Drupal site, then add a photo content type with image and comment fields.
  2. Set up drush in the server environment.

Customizing the Source

drush gen migrate-source

The drush generate command automatically produces a skeleton for the migrate module and its source plugin.

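Each PHP file under the source directory (src/Plugin/migrate/source) corresponds to one SQL source, and each YAML file under config/install corresponds to one migration configuration.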

The whole photo migration needs just two files, plus a third if comments are migrated as well.

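  • Photo.php: extracts photos from the album and pic tables of the uchome database (the file name must match the class name `Photo` for autoloading)
  • migrate_plus.migration.uchome_migrate_photo.yml: migrates the photos
  • migrate_plus.migration.uchome_migrate_comment_photo.yml: migrates the comments on the photos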
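Both migration files refer to a group named `uchome` and a database key named `uchome`. One possible way to define that group with migrate plus is a small extra config file, sketched below. The file name `migrate_plus.migration_group.uchome.yml`, the label, and the description are assumptions; the original setup may simply have relied on the `key` set in each migration.

```yaml
# Sketch only: config/install/migrate_plus.migration_group.uchome.yml
id: uchome
label: uchome
description: 'Migrations that read from the uchome database.'
# Settings listed here are merged into every migration in the group,
# so the database key would not need to be repeated in each file.
shared_configuration:
  source:
    key: uchome
dependencies:
  enforced:
    module:
      - uchome_migrate
```

The `uchome` key itself must still point at a database connection defined in the site's settings.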

Next, the generated boilerplate has to be modified. The modified code is shown below.

Photo.php:
<?php
namespace Drupal\uchome_migrate\Plugin\migrate\source;

use Drupal\migrate\Plugin\migrate\source\SqlBase;
use Drupal\migrate\Row;

/**
 * Migrate Source plugin.
 *
 * @MigrateSource(
 *   id = "uchome_migrate_photo",
 *   source_module = "uchome_migrate",
 * )
 */
 class Photo extends SqlBase {

   /**
    * {@inheritdoc}
    */
   public function query() {
     // The query must return exactly one row per item to import. Joining
     // uchome_album is safe because each picture belongs to a single album,
     // so the join cannot duplicate picture rows. The conditions keep only
     // pictures from albums with friend = 0 (no viewing restriction),
     // picnum > 0 (non-empty) and uid > 1.
     $query = $this->select('uchome_pic', 'pic');
     $query->join('uchome_album', 'album', 'pic.albumid = album.albumid');
     $query->condition('album.friend', 0, '=')
           ->condition('album.picnum', 0, '>')
           ->condition('album.uid', 1, '>')
           ->fields('pic', ['picid',
                            'albumid',
                            'uid',
                            'dateline',
                            'filename',
                            'title',
                            'filepath',
                            ])
           ->orderBy('pic.picid', 'ASC')
           ;
     return $query;
   }

   /**
    * {@inheritdoc}
    */
   public function fields() {
     $fields = [
       'picid' => $this->t('picid'),
       'albumid' => $this->t('albumid'),
       'uid' => $this->t('uid'),
       'dateline' => $this->t('dateline'),
       'filename' => $this->t('filename'),
       'title' => $this->t('title'),
       'filepath' => $this->t('filepath'),
     ];

     return $fields;
   }

   /**
    * {@inheritdoc}
    */
   public function getIds() {
     return [
       'picid' => [
         'type' => 'integer',
       ],
     ];
   }

   /**
    * {@inheritdoc}
    */
   public function prepareRow(Row $row) {
     $title = $row->getSourceProperty('title');
     $filename = $row->getSourceProperty('filename');
     if (!$title) {
       $row->setSourceProperty('title', $filename);
     }
     return parent::prepareRow($row);
   }

 }
 
migrate_plus.migration.uchome_migrate_photo.yml:
id: uchome_migrate_photo
label: Photo
migration_group: uchome
source:
  constants:
    DRUPAL_FILE_DIRECTORY: 'public://img/migrate/photo'
    SOURCE_DIRECTORY: 'YOUR SOURCE DIRECTORY OF ATTACHMENTS'
  plugin: uchome_migrate_photo
  key: uchome
  # Enable "track changes" feature.
  track_changes: true

destination:
  plugin: 'entity:uchome_photo'
  #default_bundle: album

process:
  id: picid
  title: title
  uid: uid
  created: dateline
  changed: dateline
  source_path:
    plugin: concat
    delimiter: /
    source:
      - constants/SOURCE_DIRECTORY
      - filepath

  field_image:
    plugin: image_import
    source: '@source_path'
    destination: 'constants/DRUPAL_FILE_DIRECTORY'

  field_album/target_id: albumid

migration_dependencies:
  required:
    - uchome_migrate_album
  optional: []

dependencies:
  module:
    - uchome_migrate
  enforced:
    module:
      - uchome_migrate
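The mapping above copies `uid` and `albumid` straight across, which only works if the user and album migrations kept the original uchome IDs. If they did not, the `migration_lookup` process plugin could translate source IDs into the new Drupal IDs. The sketch below shows that alternative; `uchome_migrate_user` is an assumed migration ID, while `uchome_migrate_album` is the dependency already listed above.

```yaml
# Sketch only: resolve references through earlier migrations
# instead of assuming that the old and new IDs are identical.
process:
  uid:
    plugin: migration_lookup
    migration: uchome_migrate_user   # assumed ID of the user migration
    source: uid
  field_album/target_id:
    plugin: migration_lookup
    migration: uchome_migrate_album
    source: albumid
```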

 

migrate_plus.migration.uchome_migrate_comment_photo.yml:
# Migration configuration for comments on uchome photos.
id: uchome_migrate_comment_photo
label: Comment of Photo
migration_group: uchome

source:
  plugin: uchome_migrate_comment_photo
  key: uchome
  # Enable "track changes" feature.
  track_changes: true
  defaults:
    text_format: basic_html
    entity_type: uchome_photo
    field_name: field_comment
    status: 1

destination:
  # Specify the destination plugin (usually entity:entity_type).
  plugin: entity:comment
  default_bundle: photo_comment

process:
  cid: cid
  entity_id: id
  entity_type: defaults/entity_type
  field_name: defaults/field_name
  subject: subject
  'comment_body/value': body
  'comment_body/format': defaults/text_format
  uid: authorid
  created: dateline
  changed: dateline
  status: defaults/status

migration_dependencies:
  required:
    - uchome_migrate_photo
  optional: []

dependencies:
  module:
    - uchome_migrate
  enforced:
    module:
      - uchome_migrate

Other content types are migrated by adapting the files above and repeating the same process.

 

Some caveats

1. Rolling back a comment migration fails with a serious error

$ drush mr uchome_migrate_comment_photo
 [error]  Error: Call to a member function toUrl() on null in rdf_comment_storage_load() (line 243 of /var/www/drupal8/web/core/modules/rdf/rdf.module) #0 /var/www/drupal8/web/core/lib/Drupal/Core/Entity/ContentEntityStorageBase.php(818): rdf_comment_storage_load(Array)

This is a problem in the rdf module in Drupal core. The issue is tracked here: https://www.drupal.org/project/drupal/issues/2565247, and a fix is not expected until after Drupal 8.9.

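2. migrate_file is very handy: it migrates an entity's image field in a single step. Without it, two steps are needed (migrate the images first, then the entities). It has a small bug, though: when defining a file import, neither the source nor the destination supports Drupal file URIs (public://...). A patch on drupal.org (File import does not support (well) the local Uris) fixes only the destination path; the source path of the copy is still unsupported. The source path therefore has to be given as an absolute path, for example: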

    DRUPAL_FILE_DIRECTORY: 'public://img/migrate/photo'
    SOURCE_DIRECTORY: '/var/d8files/uchome/attachment'

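3. Because drush generate is available, I generate a dedicated content entity in a module for each content type instead of using node for everything.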

drush gen content-entity
