Hitachi Vantara Pentaho Community Forums
Results 1 to 12 of 12

Thread: Does mySQL bulk loader works only wth Local datafile

  1. #1
    Join Date
    Dec 2015
    Posts
    10

    Default Does mySQL bulk loader works only wth Local datafile

    Hello, I am trying to load a csv files from cloud storage to mySQL table using mySQL bulk loader. But, it keeps on failing with message"Only local files are supported at this time".
    PDI version 5.2.0.0. And Local check box is unchecked inside bulk loader step.

    Wanted to check if this feature only works for local datafiles. Any tricks to make it read from cloud storage?

    Thank you.
    Mallika

  2. #2
    Join Date
    Dec 2015
    Posts
    10

    Default

    Hi again.. if anyone has worked with mysql bulk loader sourcing data from cloud storage, pls advise how to do so. There is a checkbox as data is in Local or not, implies it is doable; but any other settings/permission etc I am missing?

    Thanks in advance.

    -Mallika

  3. #3
    Join Date
    Oct 2013
    Posts
    216

    Default

    If you have a file on cloud , first you can check it is accessible or not from cloud. try to use text file input and text file output with csv extension. This will check file is having permission or not.
    Then use MySQL Bulk Loader.
    -- NITIN --

  4. #4
    Join Date
    Dec 2015
    Posts
    10

    Default

    Hi Nitin, the file is accessible. If I use table output with same mysql table connection and same file as Text file input, it is working fine. But "mysql bulk loader" and "csv file input" this 2 components are showing message as "Only Local files are supported at this time". The mysql bulk loader is the one which we need to use.

    Thank you.
    Mallika

  5. #5
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Is this on Windows?
    MySQL Bulkloader doesn't work fully on Windows due to OS limitations around named pipes.

  6. #6
    Join Date
    Dec 2015
    Posts
    10

    Default

    OS is Ubuntu.

  7. #7
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    I think you may be mis-understanding the Local option. I haven't looked at the source-code, but in MySQL "Local" means that the data is on the client machine relative to MySQL (this means the PDI machine). A FIFO Pipe *would* be on the PDI machine, not in the cloud.

    So:
    Do CSV Input from your Cloud source, and connect it to your MySQL Bulk Load. Leave the FIFO file as /tmp/fifo. Leave the Local data box checked. Try running it. Does the data load?

  8. #8
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Are you trying to use the JOB MySQL bulk Load step, or the Transform MySQL Bulk load step?
    They behave slightly differently.

    NOTE: I'm not a MySQL user.

  9. #9
    Join Date
    Dec 2015
    Posts
    10

    Default

    Thank you so much for the explanation.

    With a Text file input (as CSV Input shows same error message as it expects file to be local to where the job in invoked) and Bulk load to MySQL works. But it requires another step as text file input, which is not the desired one.

    There are 2 mysql bulk loader steps available (one in job level; which doesn't require any file input, and the one we were intended to use); and another in transformation level - it requires file input. In the 1st one, fifo option is not available to be modified. So assuming it will always use client m/c relative to mysql for fifo pipe. Even the "Replace data" also works in different way as it doesn't replace same value. Anyway, thats separate discussion thread.

    Wondering why 2 separate steps are there to perform similar task; and behaves slightly different.

    Again, much appreciate your thought and suggestion.

    -Mallika

  10. #10
    Join Date
    Dec 2015
    Posts
    10

    Default

    By the time I managed to form my response; you already had replied . I was using job level as the process creates table dynamically in run time; the job level one was making the solution considerably simpler.

    And, I am very new to Pentaho.

    Thanks.
    Last edited by mallika; 01-12-2016 at 04:16 PM.

  11. #11
    Join Date
    Apr 2008
    Posts
    4,696

    Default

    Quote Originally Posted by mallika View Post
    Wondering why 2 separate steps are there to perform similar task; and behaves slightly different.
    Because Jobs and Transforms perform different roles.

    You should be able to use the Job step, however, it requires that a "real" file exist, not something like http://cloudserver/path/filename.csv So you need to transfer the file to either the PDI machine or the MySQL machine first (hint: there's a job step called "Copy files" that copies files). Then you can specify the path name.

    Quote Originally Posted by mallika View Post
    always use client m/c relative to MySQL
    What do you mean by m/c ?

  12. #12
    Join Date
    Dec 2015
    Posts
    10

    Default

    I am now thinking about preparing 2 solutions - one with transformation level bulk load and another with copy file to PDI/MySQL server and using job level one. Whatever provides better performance and flexibility for multiple large files; would opt for that.

    I meant machine by m/c. During Mechanical engineering days, we used m/c always for machine; not able to get rid of college days' laziness.

    Thank you.
    Mallika

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.