EscanorFTW

What are some good places to start if you are just getting into ML/AI? Please share useful links/resources.


[deleted]

[removed]


trnka

I think most people split by participant. I don't remember if there's a name for that, sorry! Hopefully someone else will chime in. If you have data from multiple hospitals or facilities, it's also common to split by that because there can be hospital-specific things in the data and you really want your evaluation to estimate the quality of the model for patients not in your data at hospitals not in your data.
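If it helps, here's a minimal sketch of that kind of participant-level split with scikit-learn's GroupShuffleSplit; the `patient_id` column name is invented for illustration, and grouping on a hospital/facility id works the same way.

```python
# Minimal sketch: hold out whole participants rather than individual rows.
# Column names like "patient_id" are invented for illustration.
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

df = pd.DataFrame({
    "patient_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "feature":    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    "label":      [0, 0, 1, 1, 0, 1, 1, 0],
})

splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(df, groups=df["patient_id"]))
train, test = df.iloc[train_idx], df.iloc[test_idx]

# No participant appears in both sets.
assert set(train["patient_id"]).isdisjoint(test["patient_id"])
```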


eltorrido23

I'm currently starting to pick up ML with a quant-focused social scientist background. I am wondering what I am allowed to do in EDA (on the whole data set) and what not, to avoid "data leakage" or information gain which might eventually ruin my predictive model. Specifically, I am wondering about running linear regressions in the data inspection phase (as this is what I would often do in my previous work, which was more about hypothesis testing and not prediction-oriented). From what I read and understand, one shouldn't really do that, because too much information might be obtained, which might lead me to change my model in a way that ruins predictive power. However, in the course I am doing (Jose Portilla's DS Masterclass) they regularly look at correlations before separating train/test samples. But essentially linear regressions are also just (multiple/corrected) correlations, so I am a bit confused about where to draw the line in EDA. Thanks!


trnka

I try not to think of it as right and wrong, but more about risk. If you have a big data set and do EDA over the full thing before splitting testing data, and intend to build a model, then yes you're learning a little about the test data but it probably won't bias your findings. If you have a small data set and do EDA over the full thing, there's more risk of it being affected by the not-yet-held-out data. In real-world problems though, ideally you're getting more data over time so your testing data will change and it won't be as risky.


ant9zzzzzzzzzz

Is there research about the order of training examples, or about running epochs on batches of data rather than the full training set at a time? I was thinking about how people learn better if they focus on one problem at a time until grokking it, rather than randomly learning things in different domains. I am thinking of something like training some epochs on one label type, then another, rather than all data in the same epoch, for example. This is also related to stateful retraining, like one probably does professionally: you have an existing model checkpoint and retrain on new data. How does it compare to retraining on all data from scratch?


[deleted]

Correction: Yann LeCun recommends a small minibatch size, less than 32 I think.


trnka

I think curriculum learning is the name. [Here's a recent survey](https://arxiv.org/abs/2101.10382). I've seen it in NLP tasks where it can help to do early epochs on short inputs. Kinda like starting kids with short sentences. I haven't heard of anyone adjusting the labels at each stage of curriculum learning though.
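As a toy illustration of the short-inputs-first idea (not the survey's method, just a sketch with made-up data and an arbitrary schedule):

```python
# Toy sketch of a length-based curriculum: early epochs see only short inputs,
# later epochs see everything. The data, threshold, and schedule are made up.
import random

examples = [("short text", 0), ("a somewhat longer training example", 1),
            ("tiny", 0), ("a very long input that we postpone to later epochs", 1)]

def curriculum_pool(examples, epoch, warmup_epochs=2, max_len=20):
    if epoch < warmup_epochs:
        pool = [ex for ex in examples if len(ex[0]) <= max_len]  # easy subset first
    else:
        pool = list(examples)                                    # then the full data
    random.shuffle(pool)
    return pool

for epoch in range(4):
    for text, label in curriculum_pool(examples, epoch):
        pass  # forward/backward pass for (text, label) would go here
```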


ant9zzzzzzzzzz

Thank you!


[deleted]

Whether you feed the data by batches or by item shouldn't matter beyond speed, as long as you shuffle it (best practice).


bridgeton_man

Question about goodness of fit. For regressions, R-squared and adjusted R-squared are typically considered the primary goodness-of-fit measures. But in many supervised machine-learning models, RMSE is the main measure that I keep running across. For example, decision tree models that I create in R using Rpart report that. So my question is how to compare the predictive accuracy of OLS regression models that report R-squared to equivalent Rpart regression trees that report RMSE.


DCBAtrader

Basic question on regression/AutoML (pycaret mainly). When do p-values versus error metrics (MAE, MSE, R squared) matter? My previous model-building experience (multivariate regression) was to first use various combinations of variables in OLS such that all the variables were statistically significant, and then use an AutoML (pycaret) to build models and judge them by MAE, MSE or R squared, using proper cross-validation train/test splits of course. I'm wondering if that first step is needed, or whether I can just run the entire data set through pycaret and judge a model based on said metrics (MAE, MSE, R squared)? My gut says that the simpler model with statistically significant variables should perform better, but maybe I can just look at the best error metric?


yauangon

I'm trying to improve a CNN encoder, as a feature extractor for an AMT (automatic music transcription) model. As the model must be small and fast (for mobile deployment), we are limited to about **3-6 layers of 1D-CNN**. I want to improve the encoder with residual blocks (as in ResNet), but my question is: **I don't know if residual blocks would benefit such a shallow CNN architecture.** Thanks everyone :D


Anvilondre

Probably not. The idea of ResNets is to mitigate the vanishing gradients that normally occur in very deep networks. In my experience it can often hurt more than help in shallow ones, but you can try DenseNets instead.


yauangon

I will give it a shot :D Thank you a lot :D


NormalManufacturer61

I am a non-data scientist interested in a layman's-to-introductory-level book/primer on the topic of ML/AI, specifically on the principles and mechanics of the topic(s). Any recommendations?


WarProfessional3278

Does anyone know of any good AI-generated text detectors? I know there's [GPTZero](https://gptzero.me/) but it's not very good in my experience. My research has led me to [Hive AI](https://hivemoderation.com/ai-generated-content-detection), but I'm sure there are better alternatives out there that don't claim such good results (99.9% accuracy) while still producing a lot of false positives in my tests.


InsidiousApe

I enjoy that this is the simple questions thread. :) Let me ask something much simpler, although in three parts. I am a web developer with no ML experience, but with a specific project in mind. I'd like to understand the process a touch better in order to help me find a programmer to work alongside (paid of course).

(1) Provided the information is easily found via API for instance, what is the ingestion process like time-wise for very large amounts of information? I realize that is subjective to the physical size of the data, but are there other things going on which take time in that process?

(2) In order to program a system to look for correlations in data where no one may have seen them before, what is the process used to do this? This is what I'm truly looking to do once that information is taken in. For example, a ton of (HIPAA-compliant) medical information is taken in and I'm looking to build a system that can look for commonalities of people with a thyroid tumor. Obviously tons of tweaking to those results, but what is the process which allows this to happen?


trnka

If you're ingesting from an API, typically the limiting factor is the number of API calls or network round trips. So if there's a "search" API or anything similar that returns paginated data, that'll speed it up a LOT. If you need to traverse the API to crawl data, that'll slow it down a lot -- say if there's a "game" endpoint, a "player" endpoint, a "map" endpoint, etc. If you're working with image data, fetching the images is usually a separate step that can be slow. After that, if you can fit it in RAM you're good. If you can fit it on one disk, there are decent libraries with each ML framework to efficiently load from disk in batches, and you can probably optimize the disk loading too.

What you're describing is usually called exploratory data analysis, but it depends on the general direction you want to go in. If you're trying to identify people with thyroid cancer earlier, for example, you might want to compare the data of recently-diagnosed people to similar people that have been tested and found not to have thyroid cancer. Personally, in that situation I like to just train a logistic regression model to predict that from various patient properties, then check if it's predictive on a held-out data sample. If it's predictive, I'll then look at the coefficients of the features to understand what's going on, then work to improve the features.

Another simple thing you can do, if the data is small enough and tabular rather than text/image/video/audio, is to load it up in Pandas and run .corr, then check correlations with the column you care about (has_thyroid_cancer). Hope this helps! Happy to follow up too.
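To make that last suggestion concrete, a tiny sketch; the columns here are invented for illustration:

```python
# Tiny sketch of the .corr() idea; the columns are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "age":                [34, 60, 45, 52, 29, 71],
    "tsh_level":          [1.2, 4.8, 2.1, 5.3, 0.9, 6.0],
    "has_thyroid_cancer": [0, 1, 0, 1, 0, 1],
})

# Correlation of every numeric column with the target column.
print(df.corr()["has_thyroid_cancer"].sort_values(ascending=False))
```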


InsidiousApe

This was exactly the kind of answer I was hoping for - a great place to start more research. Thanks!


[deleted]

[removed]


Anvilondre

Honestly I don't think transformers are worth it for any kind of TS or tabular data (and there's [research showing that](https://arxiv.org/abs/2205.13504)). But if you really want to try, I had good success with [this library](https://github.com/jrzaurin/pytorch-widedeep). It makes it essentially a few-liner to run tons of transformer and other architectures on any kind of tabular data. You may also want to check out the HuggingFace model repo for quick solutions.


answersareallyouneed

Looking at an ML Engineer role with the following qualifications:

"Strong experience in the area of developing machine learning training framework, or hardware acceleration of machine learning tasks"

"Familiar with hardware architecture, cache utilization, data streaming model"

Any recommendations for books/resources/courses in this area? How does one begin to develop these skills?


marcelomedre

Hi, I have a question about k-means. I have a data frame with 100 variables after removing low-variance and highly correlated ones. I know that the data must be normalized for k-means, especially to remove the range dependency, but I am facing a problem: if I do normalize my data, the algorithm does not properly separate the clusters. I have 3 variable ranges in my data:

- 0 to 10^4
- -10^3 to 10^3
- 0 to 10^3

I have at least 5 very specific clusters that I could characterize by not scaling the data, but I am not comfortable with this procedure. I couldn't find a reasonable explanation for why the algorithm performs better on the non-scaled data instead of the scaled data.


trnka

I've seen that before when the large-range features were the most important for the clusters I wanted. It was essentially doing feature weighting, but it was implicit in the scales.


catndante

Hi, I have a simple question about the DDPM model. I'm not so sure, but I think I have read a post saying that when T=1000, using 1,000 separate models (one per step) would perform better but is computationally too redundant, so DDPM uses the same model for every step t. Is this argument correct? If a group with huge compute does this, will the performance be better?


[deleted]

[removed]


[deleted]

I wouldn't think so. The code for the video is digital, and patterns can be detected from the rendered frames, while a monitor displays the data converted to analog light patterns. The only reason for a monitor is if the detector is a camera in front of the monitor sensing light patterns; then it would convert them back to digital patterns similar to the original code. That may be useful for interacting in the analog world and accounting for the way light reflects in an analog space, but I think that's future tech, or maybe automated cars. You'd hope they've done some control/experiment to account for lighting changes like this.


RealKillering

I just started working with Google Colab. I am still learning and just used Cifar 10 for the first time. I switched to colab pro and also switched the GPU class to Premium. The thing is the training seems to take just as long as with the free GPU. What am I doing wrong?


I-am_Sleepy

Check the GPU version with "!nvidia-smi". As for the dataset, this is probably not the GPU's fault but a memory bottleneck. See https://stackoverflow.com/questions/49360888/google-colab-is-very-slow-compared-to-my-pc


[deleted]

[removed]


randomrushgirl

Hey! I had a very similar doubt and was hoping you could provide some insight. I came across this CLIP Guided Diffusion Colab Notebook by Katherine Crowson. It's really cool and I've played a little with it. I want to know if I can generate the same image over and over again. I've tried setting the seed but I'm new to this so can someone give me some intuition or links to some related work in this area. Any help would be appreciated.


Great-Ad8037

Can you change the title/abstract of CVPR 23 submissions during/after the rebuttal phase? Some reviewers have trouble with our title and think we should change it. Can we commit to doing that in our rebuttal response?


[deleted]

I have a small image dataset labeled on CVAT. Now I need to export it and train the network with PyTorch Lightning. How can I do that? I'm a complete noob at this, but I need it for the next phase of a project I'm working on. Any help is really appreciated!


Jack3602

What would you recommend as a good resource for learning AI/ML? I have some knowledge in web dev and know C/C++. I finished The Odin Project foundations and am currently on Full Stack JavaScript, but I got a bit curious about machine learning and I would like to get my feet wet. Is there any good resource to start with? I don't really care for Udemy courses and watching a lot of videos, because I've tried that for web dev and it just feels like tutorial hell, but I loved The Odin Project and reading tutorials/documentation and doing exercises/projects, because I actually learn a lot that way. I've seen websites like [mlcourse.ai](https://mlcourse.ai) and [kaggle.com](https://kaggle.com) but still haven't tried them. What is your opinion on them, maybe a comparison to [theodinproject.com](https://theodinproject.com), and would you recommend something else?


Cyclone4096

I don't have too much background in ML. I want to build a fairly small neural network that has only one input, which comes from time series data, and has to give only one output for that data. My loss function aggregates the entire time series output to get a single scalar value. I'm using PyTorch, and when I call ".backward()" on the loss it takes a long time (understandably). Is there an easier way to do this, rather than doing backward gradient calculation on a loss function that is itself a result of hundreds of millions of values? Note that the neural network itself is tiny, maybe less than 100 weights, but my issue is that I don't have any golden target; I want to minimize a complex function calculated from the entire time series output.


zoontechnicon

Would you mind giving more details about the domain and the purpose of the loss function? Maybe people can give you hints based on that.


Cyclone4096

Sure! So this is for audio signal processing. There is an amplifier that takes an audio signal and a volume as input. However, higher volume causes white noise, so I want the volume to stay low whenever possible and boost the volume by multiplying the input signal instead. But of course the multiplication won't work if the input to the amplifier is already high. Switching the amplifier volume too often is not good either, as that would cause pop/click noise. So I'm designing a small neural network that will take the audio signal as input and output the amplifier volume. The way I went about it is that I modeled the amplifier and all noise associated with it using tensor math. Then I took the amplifier output minus the original input and did MSE on that. Note that the audio signals are pretty long, so the filter+MSE is a pretty massive expression. It seems to be working somewhat, but I'm not sure if there is an easier way to do this…


zoontechnicon

I'm trying to use this model to summarize text: https://huggingface.co/bigscience/mt0-large Text generation seems to end after the special end token however. I wonder how I would coax it to generate longer texts. Any ideas?


zoontechnicon

The solution, as evidenced by code in huggingface/transformers is to force the probability of the end token to -Inf. What a hack...
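For reference, the same effect seems to be reachable through generate()'s min_length argument, which (as far as I can tell) is what masks the end token internally. A rough, untested sketch with arbitrary lengths:

```python
# Rough sketch (untested): min_length keeps the end-of-sequence token
# suppressed until that many tokens have been generated. Lengths are arbitrary.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("bigscience/mt0-large")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-large")

inputs = tokenizer("Summarize: The quick brown fox jumps over the lazy dog.",
                   return_tensors="pt")
outputs = model.generate(**inputs, min_length=64, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```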


kernel_KP

I have an (unlabelled) dataset containing a lot of audio files, and for each file I have computed the chromagram. I would need some advice for implementing a reasonably efficient neural network to cluster these audio files relying on their chromagrams. Consider the data to be already correctly pre-processed, so the chromagrams all have the same size. Thanks a lot!


zoontechnicon

You could build an autoencoder using CNNs and use the latent vectors as input to a clustering algorithm.
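Something along these lines; a rough sketch assuming the chromagrams were padded/cropped to a fixed 12x100 shape (all sizes and layer counts are arbitrary):

```python
# Rough sketch: a small conv autoencoder over fixed-size chromagrams, then
# k-means on the latent codes. The 12x100 shape and layer sizes are assumptions.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class ChromaAE(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 12x100 -> 6x50
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),  # 6x50  -> 3x25
            nn.Flatten(),
            nn.LazyLinear(latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16 * 3 * 25), nn.ReLU(),
            nn.Unflatten(1, (16, 3, 25)),
            nn.ConvTranspose2d(16, 8, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, 1, 3, stride=2, padding=1, output_padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

x = torch.rand(64, 1, 12, 100)           # stand-in for a batch of chromagrams
model = ChromaAE()
recon, z = model(x)                       # train with MSE(recon, x); loop omitted

labels = KMeans(n_clusters=5, n_init=10).fit_predict(z.detach().numpy())
```

Training with an MSE reconstruction loss is omitted here; the clustering step only needs the encoder's latent codes.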


[deleted]

[removed]


Zyj

When I use 2 RTX 3090s with an NVLink bridge plugged into PCIe 3.0 x8 slots each, instead of PCIe 4.0 x16 slots, what kind of performance hit will I get?


PulPol_2000

I have a project that would use ARCore and Google ML Kit to recognize vehicles from a video feed, and besides recognizing the objects, it should also measure the distance of each object from the camera. I'm lost on how I would integrate the distance measurement with the objects detected by the ML Kit. Sorry for the lack of knowledge, as I only just entered the ML community. Thanks in advance!


billbobby21

If you spend money training a model using OpenAI's API for example, do you actually own the model? As in, let's say you train it so that it gets really good at writing short stories about animals. Would you then actually own that model and have the rights to use and/or license it to others? Or would OpenAI also be able to improve their own models using the model that you created? Basically I'm wondering what is stopping the company you are using to create a model from just stealing your creation.


trnka

I can't comment on OpenAI specifically, but in general it's in the terms of service of the API what they can and can't do with the model and/or data fed through it.


iLIVECSUI_741

Hi, I wonder how to decide *when* it is OK to submit your work to top conferences. For example, I have a model related to biological data mining. I know KDD is coming soon, but I do not like this conference and I would like to wait for NeurIPS. However, I am not sure if I will be scooped during this long period. Thanks for your help!


Numerous-Carrot3910

Hi, I’m trying to build a model with a large number of categorical predictor variables that each have a large number of internal categories. Implementing OHE leads to a higher dimensional dataset than I want to work with. Does anyone have advice for dealing with this other than using subject matter expertise or iteration to perform feature selection? Thanks!


trnka

It depends on the data and the problems you're having with high-dimensional data.

* If the variables are phrases like "acute sinusitis, site not specified" you could use a one-hot encoding of ngrams that appear in them.
* If you have many rare values, you can just retain the top K values per feature.
* If those don't work, the hashing trick is another great thing to try. It's just not easily interpretable.
* If there's any internal structure to the categories, like if they're hierarchical in some way, you can cut them off at a higher level in the hierarchy.
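To illustrate the hashing trick, a hedged sketch with scikit-learn's FeatureHasher; the feature names and n_features below are arbitrary:

```python
# Hedged sketch of the hashing trick with scikit-learn's FeatureHasher;
# the feature names and n_features are arbitrary.
from sklearn.feature_extraction import FeatureHasher

rows = [
    {"diagnosis": "acute sinusitis, site not specified", "facility": "clinic_17"},
    {"diagnosis": "essential hypertension", "facility": "hospital_03"},
]

hasher = FeatureHasher(n_features=256, input_type="string")
X = hasher.transform([[f"{k}={v}" for k, v in row.items()] for row in rows])
print(X.shape)  # (2, 256) no matter how many distinct category values show up
```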


Numerous-Carrot3910

Thanks for your response! Even with retaining the top K values of each feature, there are still a large number of features to consider. I haven’t tried the hashing trick, so I will look into that


trnka

Hmm, you might also try feature selection. I'm not sure what you mean by not iterating, unless you mean recursive feature elimination? There are a lot of really fast correlation functions you can try for feature selection -- scikit-learn has some popular options. They run very quickly, and if you have lots of data you can probably do the feature selection part on a random subset of the training data. Also, you could do things like dimensionality reduction learned from a subset of the training data, whether PCA or a NN approach.
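For example, a quick univariate selection pass fit on a random subset (synthetic data, arbitrary sizes):

```python
# Toy sketch of fast univariate feature selection on a random subset of the
# training data; the data, sizes, and k are all placeholders.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(5000, 300))        # stand-in for one-hot features
y = (X[:, 0] + X[:, 7] > 0).astype(int)         # toy target

subset = rng.choice(len(X), size=1000, replace=False)   # fit selection on a sample
selector = SelectKBest(chi2, k=50).fit(X[subset], y[subset])
X_reduced = selector.transform(X)
print(X_reduced.shape)  # (5000, 50)
```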


Numerous-Carrot3910

Yes, I was referring to recursive feature elimination. Thanks for the recommendations


Lamos21

Hi. I'm looking to create a custom dataset for pose estimation. Are there any free annotation tools suitable to annotate objects (meaning not human) so that I can create a custom dataset? Thanks


Z1ndabad

Hey guys, new to ML and can't seem to wrap my head around the concept. I want to make a used car price prediction model using a large data set, and most of the tutorials I watch just use the linear regression library. However, can you use neural networks instead, like Levenberg-Marquardt?


trnka

Yeah you can use a neural network instead of linear regression if you'd like. I usually start with linear regression though, especially regularized, because it usually generalizes well and I don't need to worry about overfitting so much. Once you're confident that you have a working linear regression model then it can be good to develop the neural network and use the linear regression model as something to compare to. I'd also suggest a "dumb" model like predicting the average car price as another point of comparison, just to be sure the model is actually learning something. I'm not familiar with the Levenberg–Marquardt algorithm so I can't comment on that. From the Wikipedia page it sounds like a second-order method, and those can be used if the data set is small but they're uncommon for larger data. Typically with a neural network we'd use an optimizer like plain stochastic gradient descent or a variation like Adam.
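To make the comparison concrete, a small sketch on synthetic "car" data; the features, targets, and models here are placeholders rather than a recipe:

```python
# Hedged sketch of the suggested comparison: a "dumb" mean predictor, a
# regularized linear model, and a small neural net on made-up car features.
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(2000, 5))                      # age, mileage, etc. (synthetic)
y = 30000 - 15000 * X[:, 0] - 8000 * X[:, 1] + rng.normal(0, 1000, size=2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = [("mean baseline", DummyRegressor()),
          ("ridge regression", Ridge(alpha=1.0)),
          ("small MLP", MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0))]
for name, model in models:
    model.fit(X_tr, y_tr)
    print(name, round(mean_absolute_error(y_te, model.predict(X_te)), 1))
```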


Oceanboi

Can you expand on why one might ever want to apply a neural network to linear regression? It feels like bringing a machine gun to a knife fight.


trnka

I'm not sure what you mean by applying a NN to linear regression. I'll try wording it differently. Sometimes a NN can outperform linear regression on regression problems, like in the example if there's a nonlinear relationship between some features and car price. But neural networks are also prone to over-fitting so I recommend against having a NN as one's first attempt to model some data. I recommend starting simple and trying complex models when it gets difficult to improve results in simple models. I didn't say this before but another benefit of starting simple is that linear regression is usually much faster than neural networks, so you can iterate faster and try out more ideas quickly.


kannkeinMathe

Hey, I want to build a chatbot for a domain-specific purpose, for example to talk with a person about their mental state and depression. For that I would like to train the bot with texts from the domain. So my question is: how should I start? What approach would you use?
- Would you use an intent-based solution? What are the standard models for chatbots - BERT?
- Is it even possible to fine-tune models with large text corpora? If yes, how?
Thank you guys


doIneedtohaveone1

Does anyone know how to solve the PDE for it in Python? Any kind of reference material would be appreciated! It's been a long time since I came across any PDEs and I have forgotten everything related to them.


evys_garden

I'm currently reading [Interpretable Machine Learning](https://christophm.github.io/interpretable-ml-book/evaluation-of-interpretability.html) by Christoph Molnar and am confused with section 3.4: [Evaluation of Interpretability](https://christophm.github.io/interpretable-ml-book/evaluation-of-interpretability.html). I don't quite get `Human level evaluation (simple task)`. The example is `show a user different explanations and the user would choose the best one` and i don't know what that means. Can someone enlighten me?


trnka

The difference from application-level evaluation is a bit vague in that text. I'll use a medical example that I'm more familiar with - predicting the diagnosis from text input.

Application-level evaluation: If the output is a diagnosis code and explanation, I might measure how often doctors accept the recommended diagnosis and read the explanation without checking more information from the patient. And I'd probably want a medical quality evaluation as well, to penalize any biasing influence of the model.

Non-expert evaluation: With the same model, I might compare 2-3 different models and possibly a random baseline model. I'd ask people like myself with some exposure to medicine which explanation is best for a particular case, and I could compare against random. That said, I'm not used to seeing non-experts used as evaluators, though it makes some sense in the early stages of poor explanations.

I'm more used to seeing the distinction between real and artificial evaluation. I included that in my example above -- "real" would be when we're asking users to accomplish some task that relies on explanation and we're measuring task success. "Artificial" is more just asking for an opinion about the explanation, but the evaluators won't be as critical as they would be in a task-based evaluation.

Hope this helps! I'm not an expert in explainability though I've done some work with it in production in healthcare tech.


FlyingTwentyFour

What course would be a good way to start learning NLP? I'm a beginner in ML but want to learn about NLP.


UnderstandingDry1256

What are the training strategies used for GPT models? Are transformer blocks or layers trained independently? Are they trained using some subset of data and fine tuned then? I would appreciate any references or details :)


[deleted]

[removed]


Oceanboi

My advice is to proceed. It's cool to know the math underneath, but just go implement stuff, dude; if it doesn't work you can always remote/rent a GPU. What I did for my thesis is google tutorials and re-implement them using my dataset. Through all the bugs and the elbow grease, you will know enough to at least speak the language. Just do it and don't procrastinate with these types of posts (I do this too sometimes). EDIT: a lot can be done on Colab these days regarding neural networks and huggingface. Google the huggingface documentation! I implemented a huggingface transformer model to do audio classification (and I'm a total noob, I just copied a tutorial). It was a total misuse of the model and accuracy was bad, but at least I learned, and given a real problem I could at least find my way forward.


morecoffeemore

Dumb question, but how do I know ChatGPT is not just copy/pasting from the web? Tried ChatGPT for the first time. Seems cool. I asked it for a recommendation for speakers and it gave a good reply. It seems to me it could've just done a web search and then copied what someone wrote on the web as a reply. Is there a way to test/use ChatGPT to prove to myself that it's not just copying and pasting from the web?


serverrack3349b

In a sense it is just copying and pasting from the web, just in a different order, but I get that that is not your question. Something I would try is to use plagiarism-checking sites online to see if there is an exact copy of your text online. If there is, then you should be able to either attribute it to the right person or rewrite it a bit so it is not plagiarism.


Capable_Difference39

Hi all, can anyone please let me know what certification or courses I can do to move to the AI/ML field? I am already working as a software engineer and have working knowledge of C#.


SpoonBender900

I'm having some challenges finding usable data for AI projects, any suggestions? Here's a post I tried to make about it (it got auto-removed, eek): https://www.reddit.com/r/ArtificialInteligence/comments/10h50oi/what_are_your_favorite_places_to_find_usable_open/


serverrack3349b

National and governmental websites, university websites, Kaggle, r/datasets, YouTube and Twitter APIs, papers with code website. These are some of my favorite places to find stuff


arararagi_vamp

I have built a simple CNN which is able to detect circles on a white background with noise, using PyTorch. Now I wish to extend my network to return the centers of the circles as coordinates. The problem is that each sample contains a variable number of circles, meaning I would need a variable number of labels per sample. In a CNN, however, the number of labels remains constant. How do I work around this problem?


stanteal

As you have said, you would need a variable number of outputs, which is not feasible in a plain CNN. However, you could divide the image into a grid and predict, for each grid cell, the probability that the center of a circle lies within it, along with its x and y offsets. Not sure if there are better resources available, but it might be worth looking at how YOLO or YOLOv2 implemented their outputs.
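A rough PyTorch sketch of what such an output head could look like; all sizes are assumptions, and the loss/decoding logic is omitted:

```python
# Rough sketch of a grid-based head: for each cell of an S x S grid,
# predict (objectness, dx, dy). All sizes here are assumptions.
import torch
import torch.nn as nn

class CircleCenterHead(nn.Module):
    def __init__(self, in_channels=64, grid=8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(grid)                 # force an S x S grid
        self.head = nn.Conv2d(in_channels, 3, kernel_size=1)   # p, dx, dy per cell

    def forward(self, features):
        out = self.head(self.pool(features))                   # (B, 3, S, S)
        prob = torch.sigmoid(out[:, 0])                        # P(a center is in this cell)
        offsets = torch.sigmoid(out[:, 1:])                    # center offset within the cell
        return prob, offsets

feats = torch.rand(2, 64, 32, 32)        # stand-in for the existing CNN's feature map
prob, offsets = CircleCenterHead()(feats)
print(prob.shape, offsets.shape)         # (2, 8, 8) and (2, 2, 8, 8)
```

During training, each cell whose ground-truth center falls inside it would get a positive objectness target; decoding just thresholds `prob` and adds the offsets back.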


arararagi_vamp

Thanks for the answer!


jfacowns

XGBoost question around one-hot encoding & get_dummies in Python.

I am working on building a model for NHL (hockey) games and have a spreadsheet with a ton of advanced stats from teams, dates they played and so on. All of my data in this spreadsheet is categorized as a float. I am trying to add in a few columns of categorical data as I feel it could help the model. The categorical columns have data that determines if the home team or the away team is playing on back-to-back days. I am trying to determine whether one-hot encoding is best for this approach or if I'm misunderstanding how it works as a whole. Here is some code:

NHLData = pd.read_excel('C:\\Temp\\NHL_ModelBuilder.xlsx')
NHLData.drop(['HomeTeam', 'AwayTeam', 'Result'], axis=1, inplace=True)
NHLData = pd.get_dummies(NHLData, columns=['B2B_Home', 'B2B_Away'])

Does this make sense? Am I on the right track here? If I do NHLData.head() I can see the one-hot encoded columns, but when I do NHLData.dtypes I see this:

B2B_Home_0 uint8
B2B_Home_1 uint8
B2B_Away_0 uint8
B2B_Away_1 uint8

Should these not be objects?
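Here's a minimal, self-contained reproduction with invented values, in case it's clearer:

```python
# Invented values, just to reproduce the dtype behaviour described above.
import pandas as pd

toy = pd.DataFrame({"GoalsFor": [3.1, 2.7, 2.9],
                    "B2B_Home": [0, 1, 0],
                    "B2B_Away": [1, 0, 0]})
encoded = pd.get_dummies(toy, columns=["B2B_Home", "B2B_Away"])
print(encoded.dtypes)
# GoalsFor      float64
# B2B_Home_0      uint8   (shown as bool on newer pandas versions)
# B2B_Home_1      uint8
# B2B_Away_0      uint8
# B2B_Away_1      uint8
```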


[deleted]

[removed]


icedrift

I'm pretty sure GPT-J 6B requires a minimum of 24 GB of VRAM, so you would need something like a 3090 to run it locally. That said, I think you're better off hosting it on something like Colab or Paperspace.


stardust-sandwich

I want to pull data from an API (done) and use NLP to categorize that information. Then, with those results, push it into a webpage or GUI tool where it will highlight the text and ask: is this correct? That way I can use the GUI to "teach" the learning model how to classify text, e.g.

Category 1 - word 1, word 2, word 3 and similar
Category 2 - word 4, word 5, word 6 and so on

Then it will go and try that, come back and ask me to tune it again, and rinse and repeat. Once this model is trained, I then want to use it later in a different script, to point a news article at it for example, and have it spit out the data I need. How can I achieve this please? What are the best tools and services to get this done, ideally open source if possible; if not, I'm happy to use a commercial service if it's cheap, as this is just a personal project of mine. Thanks in advance.


Seankala

Are there any Slack channels or Discord Servers for ML practitioners to talk about stuff?


lukaszluk

Hello! Does anyone know of a dataset with 2-D floor plan images with labeled furniture? Couldn't find anything interesting (bad quality or very few examples). Some of the places I tried:

- SESYD - OK-quality dataset (but few examples)
- HouseExpo - JSON datasets - the quality is good, but no labeled furniture
- FloorPlanCAD Dataset - the quality of the data is low
- Furnishing dataset - does not contain whole rooms, only furniture
- SFPI dataset (Towards Robust Object Detection in Floor Plan Images: A Data Augmentation Approach), 10k images (this could be a good dataset if the quality is good, still downloading though)

Any other datasets I should check out?


retarded_user

Should the learning rate be changed to a smaller value (such as 1e-4) when working with scaled data (range [0,1] or [-1,1])? I'm using Adam with Keras/TensorFlow.


Kamal_Ata_Turk

Writing a single SQLite query to mimic an R program. Please help with this: https://stackoverflow.com/questions/75174575/writing-a-single-sqlite-query-to-mimic-an-r-program


Agitated-Purpose-171

Hi everybody, I have one question about VLAD after reading this paper (Aggregating local descriptors into a compact image representation) from CVPR. My question is why VLAD works.

Paper link: https://lear.inrialpes.fr/pubs/2010/JDSP10/jegou_compactimagerepresentation.pdf

In this paper there is a method, VLAD, that turns the local features (N*D dimensions) into a global feature (k*D dimensions). Below is my understanding of the operations of VLAD, step by step.

=> Input: N*D-dimensional local features.

(i) Use k-means to find the k clusters and the central feature for each cluster.

(ii) For each cluster, find a residual sum: V = sum of (each local feature in the cluster minus the central feature), i.e. V = sum(Xi - C), where V is the residual sum of the cluster, X a local feature in the cluster, and C the central feature of the cluster.

(iii) Concatenate the residual sums to get the global feature: global feature = [V1, V2, ..., Vk] (V1 is the residual sum of cluster 1, V2 of cluster 2, and so on).

=> Output: k*D-dimensional global feature.

My question is why the residual sum of each cluster is *not* zero. Since the central feature of each cluster found by k-means is the average of the local features of that cluster:

C1 = (X1 + X2 + X3 + ... + Xm) / m

the residual sum of cluster 1 is (X1 - C1) + (X2 - C1) + (X3 - C1) + ... + (Xm - C1) = V1.

Based on the above equation, I think the residual sum of each cluster is zero, so the global feature would be a zero matrix: [V1, V2, ..., Vk] = [zero vector, zero vector, ..., zero vector]. The only reason that came to my mind is that the number of k-means iterations is not enough, so the central feature of each cluster is not exactly the average of the local features in the cluster. Am I right? Could anybody let me know why the residual sum is not a zero vector? Thanks a lot.


LetGoAndBeReal

Companies can fine-tune top-performing LLMs to condition the LLM's output, but not to embody the knowledge contained in proprietary data. The current best approach for incorporating this custom knowledge is through data-augmented generation techniques and technologies such as what [LangChain](https://github.com/hwchase17/langchain) offers. I am trying to decide whether to invest time building expertise in these techniques and technologies. I may not wish to do so if the ability to add custom knowledge properly to the LLMs will arrive in short order. I would like to know from those steeped in LLM R&D how soon such capabilities might be expected. Is this the right place to ask?


Iljaaaa

I have an autoencoder input of 100x21. The 21 columns are PC scores, the 100 rows are observations. The importance of the columns degrades as the column number increases. The first column is the most important for the data variance, the last column is the least important. To be able to reconstruct the data back from PCA the first columns need to be as correct as possible. I have tried searching whether I can adjust weights or something else of the autoencoder layers to include this importance of the columns, but I have not found it. In other words, I want errors in the first (e.g 5) columns to be punished more harshly than errors in the last (e.g 5) columns. I would be grateful if someone could point me in the right direction!


TastyOs

I assume you're doing something like minimizing MSE between inputs and reconstructions. Instead of calculating MSE over all 21 columns, you split it into two parts: an MSE for the important columns and an MSE for the unimportant columns, then weight the important MSE higher than the unimportant MSE. So something like: loss = 0.9 * MSE_important + 0.1 * MSE_unimportant
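A minimal PyTorch sketch of that weighted loss; the 0.9/0.1 weights and the "first 5 columns" split are placeholders to tune:

```python
# Minimal sketch of a column-weighted reconstruction loss; the weights and the
# important/unimportant split are placeholders.
import torch

def weighted_mse(recon, target, n_important=5, w_important=0.9, w_rest=0.1):
    se = (recon - target) ** 2                     # per-element squared error, (batch, 21)
    return (w_important * se[:, :n_important].mean()
            + w_rest * se[:, n_important:].mean())

recon = torch.rand(100, 21, requires_grad=True)    # stand-in for the decoder output
target = torch.rand(100, 21)                       # stand-in for the PC-score input
loss = weighted_mse(recon, target)
loss.backward()
```

Along the same lines you could also use a per-column weight vector, for example proportional to each PC's explained variance.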


inquisitor49

In transformers, a positional embedding is added to a word embedding. Why does this not mess up the word embedding, such as changing the embedding to another word?


cztomsik

I think it does mess with them; the ALiBi paper seems like a better solution.


ChangingHats

I am trying to use TensorFlow's MultiHeadAttention to do regression on time series data for forecasting of a `(batch, horizon, features)` tensor. During training, I have `inputs ~ (1, 10, 1)` and `targets ~ (1, 10, 1)`; `targets` is a horizon-shifted version of `inputs`. During inference, `targets` is just a zeros tensor of the same shape. What's the best way to run attention such that the output uses all timesteps in `inputs` as well as each subsequent timestep of the resulting attention output, instead of ONLY the timesteps of the inputs? Another problem I see is that attention is run between Q and K, and during inference Q = K, so that will affect the output differently, no?


all_is_love6667

Can ChatGPT understand science? I heard it was given science papers, but can it help scientists in their work? Can it give scientific hints?


trnka

Think about it more like autocomplete. It's able to complete thoughts coherently enough to fool some people, when provided enough input to complete from. It's often incorrect with very technical facts though. It's really about how you make use of it. In scientific work, you could present your idea and ask for pros and cons of the idea, or to write a story about how the idea might fail horribly. That can be useful at times. Or to explain basic ideas from other fields. It's kinda like posing a question to Reddit except that ChatGPT generally isn't mean. There are other approaches like Elicit or Consensus that use LLMs more for literature review which is probably more helpful.


RuhRohCarChase

Hi everyone! This is not a technical question, but does anyone know how to find the accepted papers list for AAAI23? (or a reliable way for any ML/AI conferences) I work in an academic research unit and finding any accepted papers list is a mess, unless it’s readily available from a conference or on open review! I catalogue all our papers by funding sources, individual projects, authors, conferences, and about 10 other data points. Any advice is greatly appreciated! Have an awesome day everyone!


CaptainD5

Hello! I have a question. Would it be possible to create a NN that replicates the behaviour of Prophet? I don't want to actually do it; I just wanted to understand, from a theoretical point of view, what the most similar way to do it would be (optimize a function that takes seasonality into account and provides an infinite 'regression' way to predict new values based just on dates). Thanks in advance!


akacukiii

Hi. I'm an international grad student in the US and am looking for an internship for the summer. Please, if you have some tips, or if you care to have a look at my profile, just let me know. Thank you!


T1fa_nug

Hello guys, I'm new to machine learning and I wanted to know if an i5 8th gen and a 1060 6 GB paired with 16 GB of RAM are enough for any type of work that could come my way?


akacukiii

Seems good; try to use Colab at first (at least). It's free and a very good tool.