Mozdeh

Big Data Text Analysis


Mozdeh: Frequently Asked Questions - These no longer apply to Twitter (April 2023, except for the Academic API)

Can Mozdeh collect or analyse tweets?

Does Mozdeh continue collecting data if the computer/laptop goes into Sleep or Hibernate mode?

What happens to the data if Mozdeh crashes or if there is a power cut?

Can I merge two projects?

Can Mozdeh find YouTube comments containing timestamps?

Can Mozdeh create a subset of texts based on date?

Can Mozdeh create networks of texts from a given date range?

Can I preview a Mozdeh project while it is collecting data?

How accurate is Mozdeh's gender detection?

Can I create a copy of my project with different time slicing (day/hour/month)?

Why do I see #NAME in my data?

Why does my YouTube key not work or why has it stopped working?

Can I find nonbinary users?

Can I see user self-descriptions?

Can I count word frequencies per user instead of per text (e.g., for Twitter timeline data)?

Why am I getting a proxy server error?

Why do I always get a file busy error when running Mozdeh?

Answers

Can Mozdeh collect all tweets relevant to a topic that is in the future or less than a week old?

No, Mozdeh can no longer collect tweets, except perhaps if you have paid for access to the Twitter/X API (not tested). If you collected tweets with Mozdeh in the past, it should still work with them, except that it cannot detect user nationalities because this requires API calls.

What happens to the data if Mozdeh crashes or if there is a power cut?

If Mozdeh crashes or there is a power cut, the data is not lost. It will be saved in a file called something like TwitterSearches_Tweets.txt in the raw data subfolder of the project folder within moz_data, so the full path for the file might be something like c:\moz_data\SNP MP test\raw data\TwitterSearches_Tweets.txt. This file will contain all of the data except perhaps a few tweets from the last few minutes before the crash. If you restart Mozdeh and select the project, it will process this file as a normal project. See below if you want to merge projects from two or more TwitterSearches_Tweets.txt files after Mozdeh crashes or power cuts. You can only merge projects after the data collection has finished.

Can I merge two projects?

To create a single combined Mozdeh project from more than one set of posts collected by Mozdeh, download Webometric Analyst, start it and close the startup Wizard. From the main search interface, select the Text menu, the Merge files submenu and the option Merge any number of text files (simple consecutive merge, no checking). When asked, reply Yes to the question about ignoring header lines after the first one. The files to merge are inside the raw data subfolder of the main project folders in moz_data, so the full path for a file might be something like c:\moz_data\SNP MP test\raw data\TwitterSearches_Tweets.txt. Select all the different Mozdeh TwitterSearches_Tweets.txt files, one at a time, and choose any name for the merged file. Once the merged file is ready, export it back to Mozdeh using the Webometric Analyst button Convert Twitter Files to Mozdeh Format in the Twitter tab on the main interface. After this, start the new project in Mozdeh and it will process it.
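
For readers who want to see what a simple consecutive merge that ignores repeated header lines amounts to, here is a minimal Python sketch. It is not Webometric Analyst's code. The function name merge_text_files is hypothetical, and the sketch assumes that every input file is UTF-8 text starting with exactly one header line.

```python
def merge_text_files(input_paths, output_path):
    """Concatenate text files in the order given, keeping only the first header.

    Assumption: each input file starts with exactly one header line.
    No other checking is done (duplicates are not removed).
    """
    with open(output_path, "w", encoding="utf-8", newline="") as out:
        for index, path in enumerate(input_paths):
            with open(path, "r", encoding="utf-8", newline="") as src:
                lines = src.readlines()
            if index > 0:
                # Drop the header line of every file after the first one.
                lines = lines[1:]
            for line in lines:
                # Make sure the last line of one file does not run into
                # the first line of the next file.
                if not line.endswith("\n"):
                    line += "\n"
                out.write(line)
```

The merged file would still need to be converted back with Webometric Analyst as described above; the sketch covers only the concatenation step.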

Does Mozdeh continue collecting data if the computer/laptop goes into Sleep or Hibernate mode?

No. When the computer goes into sleep (or hibernate) mode, all programs stop running, including Mozdeh, so it cannot collect any data until the computer is brought out of sleep/hibernate mode. When the computer wakes, Mozdeh carries on as normal, without any warning that it was asleep.

Mozdeh will work with the screen turned off, so it is safe to configure your computer power management settings to switch the display off after 30 minutes of inactivity, as long as “never” is set as the time period before going into sleep/hibernation mode (the “Put the Computer to sleep:” Power Option setting).

Can Mozdeh find YouTube comments with timestamps?

In the search screen, check "Timestamps only" and click Search. This returns only comments containing something that looks like a timestamp, such as 5.43. Timestamps are not shown in the search results list but can be seen for individual results by clicking on them.
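
To illustrate what "looks like a timestamp" could mean in practice, here is a rough Python sketch. Mozdeh's actual matching rule is not documented here, so the pattern below is an assumption: one or two digits, a dot or colon, then two digits in the range 00 to 59.

```python
import re

# Assumed pattern, not Mozdeh's own rule: minutes, a separator (. or :),
# then seconds from 00 to 59, not embedded in a longer run of digits.
TIMESTAMP = re.compile(r"(?<!\d)\d{1,2}[:.][0-5]\d(?!\d)")


def has_timestamp(comment):
    """Return True if the comment contains something shaped like a timestamp."""
    return TIMESTAMP.search(comment) is not None


print(has_timestamp("Love the solo at 5.43"))  # True
print(has_timestamp("great video"))            # False
```

A pattern like this will also match some prices and version numbers (for example 10.50), so results still need to be checked by eye.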

Can I create a subset of my existing texts based on date?

It is tricky to create a subproject with a specified date range, but you can create a new project from the old data and select a date range when importing it.

  1. Start Mozdeh, enter a new project name and then click Import Data.
  2. Select 1 as the type of data to import and leave all the settings unchanged in the next dialog box except the start and end day.
    • To enter the start day click the first yellow box and select a date in the calendar.
    • To enter the end day click the first yellow box and select a date in the calendar.
  3. Copy the UserNames_Timelines.txt or TwitterSearches_Tweets.txt file from its original folder (called something like C:\rss_data\OriginalProjectName\raw data) into a new empty Windows folder and point Mozdeh to this new folder when it asks.

Can I create networks of texts from a given date range?

For this one, you will need to use the original project, not a date-filtered new project. After opening the original project, select "Make new raw data file with date restrictions" from the data menu and choose the filtered raw data file and the date range you want. This will save a date-restricted copy of the raw data file in the same place as the original one.
Now, when you select the network creation option, select the new filtered raw data file rather than the original one and you should get a network with data only from these dates.

Does Mozdeh store the Tweet URLs?

The original link is not saved anywhere, but if the tweet is still live then you can find it by adding the tweet Entry ID (second column of the raw data file) to the end of the URL https://twitter.com/statuses/ and it should redirect to the standard URL. For example,
https://twitter.com/statuses/862336612591689733 redirects to https://twitter.com/CamOpenAccess/status/862336612591689733
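
If you have many Entry IDs, the same URL can be built in a short script. This is a minimal Python sketch; the function name tweet_url is hypothetical, and it assumes the Entry ID is a string of digits as in the example above.

```python
def tweet_url(entry_id):
    """Build the short-form tweet URL from a tweet Entry ID."""
    entry_id = str(entry_id).strip()
    if not entry_id.isdigit():
        # Assumption: Entry IDs are purely numeric, as in the example above.
        raise ValueError("Entry ID should contain digits only: %r" % entry_id)
    return "https://twitter.com/statuses/" + entry_id


print(tweet_url("862336612591689733"))
# https://twitter.com/statuses/862336612591689733
```

Whether the URL still redirects depends on the tweet being live, as noted above.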

Does Mozdeh store the gender/sentiment/country information with the texts?

The gender information is only in the interface version, sorry. Once it has been examined in the interface, the gender information is saved in a file called genderinfo.txt in the main project folder, which matches up with the ID numbers, in case this helps. The same is true for sentiment in the "Item and Feed IDs" folder. There is an "Add country code" option in the File menu of Mozdeh, which might be useful.

Can I preview a Mozdeh project while it is collecting data?

Yes - click the Make copy of project button towards the bottom of the data collection screen. You will need to start a second copy of Mozdeh, which can run at the same time as the first one, and open the project copy to see it.

Can I create a copy of my project with different time slicing (day/hour/month)?

Yes but it is a bit tricky.

Why is REPLY @user in the Mozdeh tweet but not in the original tweet?

If a tweet is recorded by Twitter as a reply to @user1 but @user1 is not mentioned in the tweet then Mozdeh adds REPLY @user1 to the start of the tweet at data collection time so that these replies can be included in network analyses.
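
The rule can be sketched in a few lines of Python. This is an illustration only, not Mozdeh's code: the function name add_reply_marker is hypothetical, and the case-insensitive check for an existing mention is an assumption.

```python
import re


def add_reply_marker(text, reply_to_user):
    """Prefix 'REPLY @user' when the replied-to user is not mentioned in the text."""
    if not reply_to_user:
        return text  # not recorded as a reply
    mention = "@" + reply_to_user.lstrip("@")
    # Assumption: a mention counts only as a whole handle, ignoring case,
    # so @user10 does not count as a mention of @user1.
    pattern = r"(?<!\w)" + re.escape(mention) + r"(?!\w)"
    if re.search(pattern, text, re.IGNORECASE):
        return text
    return "REPLY " + mention + " " + text


print(add_reply_marker("I agree", "user1"))         # REPLY @user1 I agree
print(add_reply_marker("@user1 I agree", "user1"))  # @user1 I agree
```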

Why do I see #NAME in my data?

If you view YouTube video comments in Excel then any line (comment or videoID) starting with a minus sign is interpreted as a "bad formula name" by Excel. To get round this problem, start a new copy of Excel, right-click in the top left-hand corner of a worksheet, select Format and then Text. This sets all cells of the worksheet to expect text, so Excel does not try to convert anything into a formula. If you copy and paste your Mozdeh data into here then it should no longer produce #NAME anywhere. If you still get #NAME then it is possible that (a) you have previously saved the file with Excel or (b) your computer is configured to process text files through Excel in some way, even though it looks like you are not using Excel. For problem (a) you would have to re-collect the data, but for (b) you might have to try a different computer.

Why does Mozdeh get truncated retweets, limited to 140 characters?

Since the 20 June 2020 upgrade, Mozdeh should gather full tweets in all cases. Earlier versions truncated retweets.

Why does my YouTube key not work or why has it stopped working?

YouTube keys often stop working suddenly, which may be caused by a YouTube glitch. Try logging on to the Google Developer platform, creating a new project, adding the YouTube Data API v3 to the new project and generating credentials for it (in that order). This new key might work. This almost always works for me.

Can I find which country users are from?

This was possible for Twitter but not for YouTube or other sources.

Versions of Mozdeh from July 2020 onwards can detect the country of Twitter users (see the menu option: Advanced|Get countries of Twitter users...). It does this by retrieving the location information from Twitter (may require a Twitter logon) and matching it against a list of country names in English and a list of major city names in English. For example, a user location of Wolverhampton, UK would map to UK and a user location of Beijing would map to China. But a user location of My Home, 中国, or Wombourne (small village) would map to None.
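
The matching idea can be illustrated with a small Python sketch. The two dictionaries below are tiny hypothetical stand-ins for Mozdeh's real country and city lists, and splitting the location on commas is an assumption about how the matching works.

```python
import re

# Hypothetical miniature lists; the real lists are much longer.
COUNTRIES = {"uk": "UK", "united kingdom": "UK", "china": "China"}
MAJOR_CITIES = {"wolverhampton": "UK", "beijing": "China"}


def country_from_location(location):
    """Map a free-text user location to a country name, or None if unmatched."""
    # Assumption: compare each comma-separated part, ignoring case.
    parts = [p.strip().lower() for p in re.split(r"[,/|]", location) if p.strip()]
    for part in parts:
        if part in COUNTRIES:
            return COUNTRIES[part]
    for part in parts:
        if part in MAJOR_CITIES:
            return MAJOR_CITIES[part]
    return None


print(country_from_location("Wolverhampton, UK"))  # UK
print(country_from_location("Beijing"))            # China
print(country_from_location("Wombourne"))          # None
```

Because the lists are in English only, a location such as 中国 finds no match, which is why it maps to None.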

Can I find nonbinary users?

This was possible for Twitter but not for YouTube or other sources because nonbinary identities cannot be guessed from names and pronouns are not systematically recorded in YouTube and Reddit.

Mozdeh could detect nonbinary Twitter users (see the menu option: Advanced|Identify nonbinary, male, female...; you may need to activate Advanced|Get countries of Twitter users... first). It did this by retrieving the user self-description information from Twitter for all users in the current project (may require a Twitter logon) and searching the display name and self-description fields. Any user reporting they/them pronouns in either field and not she/her or he/him is categorised as nonbinary.
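
The categorisation rule in the last sentence can be written as a short Python sketch. This is an illustration, not Mozdeh's code: the function name is_nonbinary is hypothetical, and the exact pronoun spellings matched (slash-separated, any letter case, optional spaces) are assumptions.

```python
import re

# Assumed spellings: "they/them", "she/her", "he/him", ignoring case and
# allowing spaces around the slash.
THEY_THEM = re.compile(r"\bthey\s*/\s*them\b", re.IGNORECASE)
SHE_OR_HE = re.compile(r"\b(?:she\s*/\s*her|he\s*/\s*him)\b", re.IGNORECASE)


def is_nonbinary(display_name, description):
    """True if they/them appears in either field and she/her, he/him appear in neither."""
    fields = (display_name or "", description or "")
    reports_they = any(THEY_THEM.search(f) for f in fields)
    reports_she_or_he = any(SHE_OR_HE.search(f) for f in fields)
    return reports_they and not reports_she_or_he


print(is_nonbinary("Sam (they/them)", "Writer"))            # True
print(is_nonbinary("Alex", "Writer. she/her, they/them"))   # False
```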

Can I see user self-descriptions?

This is possible for Twitter but not for YouTube or other sources.

Versions of Mozdeh from July 2020 onwards can show user descriptions when clicking on a search result. This is only possible after loading the countries of search results (this also downloads the description information). To load countries, see the menu option: Advanced|Get countries of Twitter users.... It does this by retrieving the user self-description information from Twitter for all users in the current project and reporting the self-description fields. After loading countries, load the self-descriptions (see the menu option: Advanced|Report user descriptions...).

Can I count word frequencies per user instead of per text (e.g., for Twitter timeline data)?

Yes, but you will need to create a new project, import your old data into it and select the option Merge all texts from the same user. Here is how it works for Twitter timelines.

  1. Copy the UserNames_Timelines.txt file into a new folder where it is on its own.
  2. Start Mozdeh, enter a new project name and click the Import data button.
  3. Enter number 1 (Twitter) for the data type and click OK.
  4. Browse for the folder containing one file with all the tweeters' tweets and click OK.
  5. Select Merge all texts from the same user at the bottom of the massive dialog box and click OK.
  6. Click OK for all the other dialog boxes to accept the results.

This should give a new project in which each user has one "megatweet" consisting of all their tweets merged into one. The file vocabulary_items.txt in the project folder should then report the number of users tweeting each word at least once.
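
The effect of merging texts per user can be shown with a small Python sketch. It is not how Mozdeh builds vocabulary_items.txt; the function name users_per_word and the simple lower-cased word splitting are assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def users_per_word(posts):
    """Count how many users used each word at least once.

    posts is an iterable of (user, text) pairs.
    """
    # Step 1: merge all texts from the same user into one "megatweet".
    merged = defaultdict(list)
    for user, text in posts:
        merged[user].append(text)

    # Step 2: count each word once per user, however often they used it.
    counts = Counter()
    for texts in merged.values():
        # Assumption: words are runs of letters/digits, compared in lower case.
        words = set(re.findall(r"\w+", " ".join(texts).lower()))
        counts.update(words)
    return counts


posts = [("a", "cats cats dogs"), ("a", "cats"), ("b", "dogs birds")]
print(users_per_word(posts)["cats"])  # 1
print(users_per_word(posts)["dogs"])  # 2
```

Counting per text instead would give cats a frequency of 2 here, because user a mentions cats in two separate tweets.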

Why am I getting a proxy server error?

If you get an error message about proxy permissions when running Mozdeh from a work computer, then please either get your network administrator to allow Mozdeh to access the internet or run it from a non-work computer, such as from home, if you can.

Why do I always get a file busy error when running Mozdeh?

If you get a file access error and are running Mozdeh from a network drive, the cloud or storage not on your computer, please try running it from a USB stick attached to your computer because network delays can cause it problems.

Made by the University of Wolverhampton during the CREEN and CyberEmotions EU projects and updated at the University of Sheffield.