
Encoding issue when importing data in MEGA

aleca
Retired

Hi everyone,

 

I am working on a C# import tool to populate a MEGA repository with Org-Units, Persons and Sites.

My data is stored in a CSV file exported from a SQL database; the collation is "Latin1_General_CI_AS".

The issue is as follows: when I create my objects in the MEGA repository, all accented characters are changed to "?".

 

As an example:

CSV file : "Jérôme"   becomes   "J?r?me" in MEGA.

  
 
 Here is a sample of my code:
 

// Create an instance of StreamReader to read from a file.

StreamReader sr = new StreamReader(filetoimport);          // "filetoimport" is the path to my CSV file

sr.BaseStream.Seek(0, System.IO.SeekOrigin.Begin);

while ((temp = sr.ReadLine()) != null)

{

   // I gather information here, and store it in a list called List[i]

}

sr.Close();

 
 

Then, I create objects in MEGA using my list created previously:

 

oObject = oRoot.get_GetCollection("Org-Unit", null, null, null, null, null, null).get_Create(null, null, null, null, null, null);

oObject.SetProp("Name", List[i].Name, "External");

 

There is no error or problem reported during code execution, but every accented character is replaced by "?".

 

 

While testing, I noticed that converting the CSV file to UTF-8 encoding before the import helps with this issue. However, that is not a viable solution in my project's context.

I also think that MEGA LAB may have encountered the same kind of issue when working on SQL Repository Storage for MEGA.

 

Any help will be welcome!

 

 

ALA

 

 

3 Replies

Problem solved, thanks to François and Eric.

 

The problem came from StreamReader: by default, it decodes the file as UTF-8. We also discovered that the data is corrupted not when it is imported into MEGA, but when the program reads the source file.

 

StreamReader sr = new StreamReader(filetoimport);

has to be replaced by:

StreamReader sr = new StreamReader(filetoimport, Encoding.GetEncoding(28591));

 

"28591" is the code page identifier for ISO-8859-1 (Latin-1), which covers the accented characters in my "Latin1_General_CI_AS" data.
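
For anyone who wants to reproduce the difference, here is a minimal sketch, not the actual import tool. The class and method names (EncodingReadDemo, WriteLatin1Sample, ReadFirstLine) are made up for the illustration, and the sample file only contains the ISO-8859-1 bytes for "Jérôme". Reading it as UTF-8 does not give the name back, because the bytes e9 and f4 are not valid UTF-8 sequences on their own; reading it with code page 28591 does.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class EncodingReadDemo
{
    // Writes the ISO-8859-1 bytes of "Jérôme" to a temporary file
    // and returns the path of that file.
    public static string WriteLatin1Sample()
    {
        string path = Path.GetTempFileName();
        byte[] bytes = { 0x4a, 0xe9, 0x72, 0xf4, 0x6d, 0x65 };
        File.WriteAllBytes(path, bytes);
        return path;
    }

    // Reads the first line of the file, decoding it with the given encoding.
    public static string ReadFirstLine(string path, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(path, encoding))
        {
            return reader.ReadLine();
        }
    }
}
```

Calling ReadFirstLine(path, Encoding.GetEncoding(28591)) on the sample returns the original name, while ReadFirstLine(path, Encoding.UTF8) returns a string in which the accented letters have been substituted during decoding.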

 

 

I really want to thank both of you for your availability.

 

 

Hi François, thank you for the answer. 

 

I think what I have in List[i].Name is correct, as I can write "Jérôme" correctly in my Excel logfile.

There is indeed a formatting problem when writing data to MEGA. My problem is that I can't modify my source file, which is encoded in "Latin1_General_CI_AS", and this format is not recognized by MEGA.

 

I want to know if there is a "SetProp" parameter or an out-of-the-box method to convert my source file data into something MEGA accepts.

As I said previously, I think the LAB MEGA has encountered the same kind of issue while developing SQL Repository Storage management.

 

François
Administrator

Hello,

 

Basically, what MEGA receives is not correctly formatted.

In your example, "Jérôme".

MEGA expects "6a e9 72 f4 6d 65" in hexadecimal for "jérôme".

You should debug your code: for example, check that List[i].Name holds the expected value.

You will then be able to see that the values changed while you were collecting them.
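
One possible way to do that check is to print the code of each character of the value, as in this small sketch. HexDump and ToHexCodes are hypothetical names, not part of the MEGA API; the method simply prints each character's code in lowercase hexadecimal so it can be compared with the sequence above.

```csharp
using System;
using System.Linq;

// Hypothetical debugging helper, for illustration only.
public static class HexDump
{
    // Returns the code of each character in lowercase hexadecimal,
    // separated by spaces (at least two digits per character).
    public static string ToHexCodes(string value)
    {
        return string.Join(" ", value.Select(c => ((int)c).ToString("x2")));
    }
}
```

For a correctly read "jérôme", HexDump.ToHexCodes(List[i].Name) should give "6a e9 72 f4 6d 65"; any other codes in place of e9 or f4 mean the value was already altered before it reached SetProp.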

 

See you.