Mozdeh

Big Data Text Analysis

Importing text data into Mozdeh

Mozdeh can import data from any source, as long as it is in a standard plain text format or one of the other pre-defined formats.

To import data, start Mozdeh, enter a project name and then click Import Data (instead of New Project). You will then be asked a series of questions about your data. This page describes the format required for Mozdeh to recognise text.

For general texts, your data must be in one or more files as follows:

EITHER

A1) Plain text, tab delimited format AND
A2) The first line must contain the names of the columns AND
A3) One column must contain text data AND
A4) One column must contain the date in standard format, such as: 2010-12-01 or Mon Apr 04 20:13:22 +0000 2016 AND
A5) One column must contain a text label for the data, such as a topic. The label can be the same for all or some of the texts.

OR

B1) Plain text, one text on each line and no tabs anywhere.

Hints for A:

If your data is in a different format, it may be possible to load it into Excel and then choose Save As and then Text (tab delimited) as the format.
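If Excel is not available, the same conversion can be scripted. The following is a minimal Python sketch, assuming the source is a comma-separated file whose first row already holds the column names. The function name, the file paths and the UTF-8 encoding are illustrative assumptions, not something Mozdeh prescribes.

```python
import csv

def csv_to_tab_delimited(csv_path, tsv_path):
    """Copy a comma-separated file to a tab-delimited one, keeping the header row first."""
    # UTF-8 is an assumption here; use whatever encoding your data is in.
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(tsv_path, "w", newline="", encoding="utf-8") as dst:
        for row in csv.reader(src):
            # Tabs or line breaks inside a cell would break the column layout,
            # so collapse every run of whitespace to a single space.
            cells = [" ".join(cell.split()) for cell in row]
            dst.write("\t".join(cells) + "\n")
```

Calling csv_to_tab_delimited("tweets.csv", "tweets.txt") with hypothetical file names would produce a file that meets A1 and A2, provided the original columns also satisfy A3 to A5.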

If your dates are in a different format, it may be possible to load the data into Excel and then make a new column in the first date format above using the formula below, where A2 is the Excel cell containing the original date.

=CONCAT(YEAR(A2),"-",MONTH(A2),"-",DAY(A2))
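Note that the formula above does not pad single-digit months or days with a zero, so 1 December 2010 comes out as 2010-12-1 rather than 2010-12-01. This page does not say whether Mozdeh accepts the unpadded form. As an alternative outside Excel, the sketch below converts a date string to the zero-padded form in Python. The function name and the default input format (day/month/year) are assumptions for illustration; adjust input_format to match your own data.

```python
from datetime import datetime

def to_iso_date(value, input_format="%d/%m/%Y"):
    """Convert a date string to zero-padded YYYY-MM-DD."""
    # input_format is an assumption about how the original dates are written.
    return datetime.strptime(value, input_format).strftime("%Y-%m-%d")

print(to_iso_date("1/12/2010"))  # 2010-12-01
```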

Hints for B:

For B, choose the plain text only import option in the Import Data Wizard (probably number 11).

All texts will be given today's date as their date.

The label, author and URL of each text will be the filename. If you have different sets of texts that you want to compare, put them in different files in the same folder and give each file a meaningful name before importing into Mozdeh. The different files can then be compared through Mozdeh's label function.

For this option, the author and date functions will not be useful.
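Files for option B can also be prepared with a short script. The following Python sketch, offered under stated assumptions, writes one file per set of texts, with one text on each line and no tabs. The function name, the .txt extension and the UTF-8 encoding are illustrative choices, and each label is assumed to be usable as a file name.

```python
from pathlib import Path

def write_plain_text_files(texts_by_label, folder):
    """Write one file per label, one text per line, with no tabs anywhere."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    for label, texts in texts_by_label.items():
        # Collapse tabs, line breaks and repeated spaces so each text fits on one line.
        lines = [" ".join(text.split()) for text in texts]
        # Drop texts that are empty after cleaning.
        lines = [line for line in lines if line]
        # The file name becomes the label inside Mozdeh, so name it meaningfully.
        (out_dir / f"{label}.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
```

For example, write_plain_text_files({"cats": [...], "dogs": [...]}, "my_texts") would create cats.txt and dogs.txt in one folder, ready to be compared by label after import.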

Made by the University of Wolverhampton during the CREEN and CyberEmotions EU projects and updated at the University of Sheffield.